Abstract

Let S be a denumerable state space and let P be a transition probabil- 
ity matrix on S. If a denumerable set M of nonnegative matrices is such 
that the sum of the matrices is equal to P, then we call M a partition of 
P. 

Let $K$ denote the set of probability vectors on $S$. To every partition
$\mathcal{M}$ of $P$ we can associate a transition probability function $\mathbf{P}_{\mathcal{M}}$ on $K$
defined in such a way that if $p \in K$ and $M \in \mathcal{M}$ are such that $\|pM\| > 0$,
then, with probability $\|pM\|$, the vector $p$ is transferred to the vector
$pM/\|pM\|$. Here $\|\cdot\|$ denotes the $l_1$-norm.

In this paper we investigate convergence in distribution for Markov 
chains generated by transition probability functions induced by partitions 
of transition probability matrices. 

An important application of the convergence results obtained is to 
filtering processes of partially observed Markov chains. 
1 Introduction



1.1. The transition probability function $\mathbf{P}_{\mathcal{M}}$. Let $S$ be a denumerable set
and let $P$ be a transition probability matrix (tr.pr.m) on $S$. A denumerable set
$\mathcal{M} = \{M(w) : w \in W\}$ of nonnegative $S \times S$ matrices such that $\sum_{w \in W} M(w) = P$
will in this paper be called a partition of $P$. We denote the set of all tr.pr.ms
on $S$ by $PM(S \times S)$.

Next, let $K$ denote the set of probability vectors on $S$, let $\|\cdot\|$ denote the
$l_1$-norm, define a metric $\delta$ on $K$ by $\delta(x,y) = \|x - y\|$ and let $\mathcal{E}$ denote the
Borel field induced by $\delta$. Let $\mathcal{P}(K)$ denote the set of probability measures on
$(K,\mathcal{E})$. An element $x$ in $K$ will be considered as a row vector. We denote the
$i$:th coordinate of a generic vector $x$ by $(x)_i$ and we denote the $(i,j)$:th element
of a generic matrix $M$ by $(M)_{i,j}$.

To every partition $\mathcal{M} = \{M(w) : w \in W\}$ of $P \in PM(S \times S)$ one can associate
a transition probability function (tr.pr.f) $\mathbf{P}_{\mathcal{M}} : K \times \mathcal{E} \to [0,1]$ on $(K,\mathcal{E})$ by

$$\mathbf{P}_{\mathcal{M}}(x,B) = \sum_{w \in W_{\mathcal{M}}(x,B)} \|xM(w)\|, \quad x \in K,\ B \in \mathcal{E}, \tag{1}$$

where

$$W_{\mathcal{M}}(x,B) = \{w \in W : \|xM(w)\| > 0,\ xM(w)/\|xM(w)\| \in B\}.$$

That $\mathbf{P}_{\mathcal{M}}(x,\cdot)$ is a probability measure for each $x \in K$ is easily proved, and that
$\mathbf{P}_{\mathcal{M}}(\cdot,B)$ is Borel-measurable for each $B \in \mathcal{E}$ can be proved by fairly standard
arguments. The details will be given in the last section.

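Let $C[K]$ denote the set of real, bounded, continuous functions on $(K,\mathcal{E})$.
From the definition of $\mathbf{P}_{\mathcal{M}}(x,\cdot)$ it is easily seen that

$$\int_K u(y)\,\mathbf{P}_{\mathcal{M}}(x,dy) = \sum_{w \in W_{\mathcal{M}}(x)} u(xM(w)/\|xM(w)\|)\,\|xM(w)\|$$

where

$$W_{\mathcal{M}}(x) = \{w \in W : \|xM(w)\| > 0\}. \tag{2}$$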
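As an illustration only, the transition mechanism in (1) can be sketched in Python for a finite state space. The two matrices and the starting vector below are hypothetical, chosen so that the matrices add up to a transition probability matrix; the helper names are not taken from the paper.

```python
import random

def vec_mat(x, M):
    """Row vector times matrix: (xM)_j = sum_i x_i * M_ij."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def l1_norm(v):
    return sum(abs(t) for t in v)

def transition_atoms(x, partition):
    """Atoms of P_M(x, .): pairs (||xM(w)||, xM(w)/||xM(w)||) for w in W_M(x)."""
    atoms = []
    for M in partition:
        y = vec_mat(x, M)
        mass = l1_norm(y)
        if mass > 0:  # only w with ||xM(w)|| > 0 contribute
            atoms.append((mass, [t / mass for t in y]))
    return atoms

def step(x, partition, rng):
    """Draw the next probability vector of the chain started at x."""
    atoms = transition_atoms(x, partition)
    masses = [m for m, _ in atoms]
    targets = [y for _, y in atoms]
    return rng.choices(targets, weights=masses, k=1)[0]

# Hypothetical 2-state example: M1 + M2 is a transition probability matrix.
M1 = [[0.3, 0.1], [0.05, 0.5]]
M2 = [[0.2, 0.4], [0.2, 0.25]]
partition = [M1, M2]
x0 = [0.4, 0.6]

atoms = transition_atoms(x0, partition)
total_mass = sum(m for m, _ in atoms)   # equals ||x0 P|| = 1 up to rounding
x1 = step(x0, partition, random.Random(0))
```

Each atom is a pair (probability, next vector); the probabilities add up to one because the matrices add up to a stochastic matrix.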
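Next, let $\mathbf{P}^n_{\mathcal{M}}(\cdot,\cdot)$ denote the $n$-step tr.pr.f defined recursively by

$$\mathbf{P}^1_{\mathcal{M}}(x,B) = \mathbf{P}_{\mathcal{M}}(x,B), \quad x \in K,\ B \in \mathcal{E},$$

$$\mathbf{P}^{n+1}_{\mathcal{M}}(x,B) = \int_K \mathbf{P}^n_{\mathcal{M}}(y,B)\,\mathbf{P}_{\mathcal{M}}(x,dy), \quad x \in K,\ B \in \mathcal{E},\ n = 1,2,\ldots.$$

If the tr.pr.f $\mathbf{P}_{\mathcal{M}}(\cdot,\cdot)$ is such that there exists a probability measure $\mu \in \mathcal{P}(K)$
such that

$$\lim_{n\to\infty} \int_K u(y)\,\mathbf{P}^n_{\mathcal{M}}(x,dy) = \int_K u(y)\,\mu(dy), \quad \forall u \in C[K],\ \forall x \in K,$$

then we say that $\mathbf{P}_{\mathcal{M}}(\cdot,\cdot)$ is asymptotically stable.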

The main purpose of this paper is to give a sufficient condition for asymptotic
stability of $\mathbf{P}_{\mathcal{M}}(\cdot,\cdot)$ when the tr.pr.m $P$ on $S$ is irreducible, aperiodic and
positively recurrent.



1.2. Motivation. The interrelationship with the filtering process.

Let $(S,\mathcal{S})$ be a measurable space, and let $P : S \times \mathcal{S} \to [0,1]$ be a tr.pr.f on
$(S,\mathcal{S})$. Let $(A,\mathcal{A})$ be another measurable space and let $R : S \times \mathcal{A} \to [0,1]$ be a
tr.pr.f from $(S,\mathcal{S})$ to $(A,\mathcal{A})$. Let $\mathcal{P}(S)$ denote the set of probability measures
on $(S,\mathcal{S})$. It is well-known (see e.g. [IT] or [23] for the case when $S$ and $A$
are finite sets) that to each $p_0 \in \mathcal{P}(S)$ one can define two stochastic processes
$\{X_0, X_1, X_2, X_3, \ldots\}$ and $\{Y_1, Y_2, Y_3, \ldots\}$ taking values in $S$ and $A$ respectively,
such that

1) $\{X_n, n = 0,1,2,\ldots\}$ is a Markov chain with tr.pr.f $P$ and initial distribution
$p_0$;

2) for $n = 1,2,3,\ldots$ and $D \in \mathcal{S}$

$$\Pr[X_{n+1} \in D \mid X_0 = i_0, X_1 = i_1, Y_1 = a_1, X_2 = i_2, Y_2 = a_2, \ldots, X_n = i, Y_n = a_n] = \Pr[X_{n+1} \in D \mid X_n = i] = P(i,D);$$

3) for $n = 1,2,3,\ldots$ and $B \in \mathcal{A}$

$$\Pr[Y_n \in B \mid X_0 = i_0, X_1 = i_1, Y_1 = a_1, X_2 = i_2, Y_2 = a_2, \ldots, X_n = i] = \Pr[Y_n \in B \mid X_n = i] = R(i,B).$$

The Markov chain $\{X_n\}$ is often called the hidden process, whereas the process
$\{Y_n\}$ is often called the observation process.

Next let $Z_n$, $n = 1,2,\ldots$ denote the conditional distribution of $X_n$ given the
observations $Y_1, Y_2, \ldots, Y_n$. We call $Z_n$ the conditional state distribution. The
process $\{Z_n, n = 1,2,\ldots\}$ is often called the filtering process, in particular in
the engineering literature.

For more than half a century scientists have worked on the problem of drawing
conclusions about the hidden process $\{X_n\}$ from the observed process $\{Y_n\}$,
under various assumptions about the spaces $(S,\mathcal{S})$ and $(A,\mathcal{A})$ and the tr.pr.fs
$P : S \times \mathcal{S} \to [0,1]$ and $R : S \times \mathcal{A} \to [0,1]$. Much work has also been done
when both the hidden process and the observed process are time-continuous
processes.

The following is a short list of natural problems: 

1) Use the observations $Y_1, Y_2, \ldots$ to estimate the parameters of the tr.pr.f $P$ and
the tr.pr.f $R$. (An early paper regarding this problem for the case when both
the spaces $S$ and $A$ are finite is the paper [5] by L. Baum and T. Petrie.)

2) Use the observations $Y_1, Y_2, \ldots, Y_n$ to estimate both $X_n$ and $X_{n+1}$.
(When both the hidden process and the observed process are linear Gaussian
processes this problem falls into the theory called Kalman filter theory.)

3) How does $Z_n$ depend on the initial distribution $p_0$? If $Z_n$ becomes insensitive
to the initial distribution $p_0$ of $X_0$, this property is often called the stability
property of the filtering problem (see e.g. [14]) or simply the forgetting of initial
distribution property (see e.g. [5], chapter 4). This problem has been studied
actively in the last few years, and we refer to the recent paper [2] by van
Handel for more information about this important problem. In [14] references
to other work can be found, some of which have a detailed reference list.

4) Does the filtering process $\{Z_n\}$ converge in distribution and, if so, is the limit
distribution independent of the initial distribution $p_0$? This problem is at the
center of this paper.

It seems fair to say that the first three problems are important from the 
practical point of view. Roughly speaking problem 1) deals with the problem of 
how to determine the most accurate model, problem 2) deals with the problem 
of how to make the "best" estimation given the observations and problem 3) 
deals with the problem of how dependent our "best" estimation is with respect 
to initial assumptions. The fourth problem is perhaps mainly of theoretical 
interest but since this problem is related to the computation of the entropy rate 
of the observation process it is perhaps fair to also consider problem 4) as being 
of importance from a practical point of view. 

One of the first papers within this field of probability theory is the paper
[3] from 1957 by D. Blackwell. He considered the case when $S$ is a finite set,
implying that the $\{X_n\}$-process is an ordinary finite Markov chain with
finite-dimensional tr.pr.m $P$, and he assumed that the $\{Y_n\}$-process was determined
by a "lumping" function $g : S \to A$ such that $Y_n = g(X_n)$. The main result in
[3] is a formula for the entropy rate of the $\{Y_n\}$-process, a formula based on
the existence of a stationary measure for the filtering process $Z_n$. Blackwell also
showed that the filtering process itself is a Markov chain and proved the existence
of a unique stationary measure for the filtering process when the tr.pr.m
$P \in PM(S \times S)$ has "nearly identical rows and no element which is very small".

A classical paper in this field of probability is the paper [20] from 1971 by
H. Kunita. Kunita assumes that the hidden process is a time-continuous Markov
process on a compact, separable Hausdorff space, and that the observed process
is also a time-continuous process taking values in $\mathbb{R}^n$ and such that it can be
described as a stochastic integral based on the Wiener process. Among other
things, the question regarding convergence in distribution of the filtering
process is also considered in [20].

Another rather early paper in this field is the paper [16] from 1975. In [16]
it is assumed that both the space $S$ and the space $A$ are finite and that the
observation process is determined by a lumping function $g : S \to A$. It was
proved that the distributions of the filtering process converge in distribution
towards a unique limit distribution independent of the initial distribution $p_0$
when $P$ is irreducible and aperiodic and a rather mild extra condition holds.
(See Condition A of [16].) In [16] an example was given that shows that there
need not be a unique limit measure for the filtering process even if the Markov
chain $\{X_n\}$ is irreducible and aperiodic.

In the paper [19] from 2006 by F. Kochman and J. Reeds, the result obtained
in [16] was generalized and the proof simplified. Condition A of [16] was replaced
by a "rank 1 condition" for certain matrix products that arise naturally when
studying the filtering process for denumerable sets $S$ and $A$.

The same year the paper [15] by T. Holliday, P. Glynn and A. Goldsmith was
published, in which a convergence theorem regarding the distributions of
the filtering process is also proved for the case when both the sets $S$ and $A$ are finite
sets. The result in [15] is proved under a positivity condition for the tr.pr.f
(tr.pr.m) from $S$ to $A$, which is stronger than Condition A introduced in [16].

In the paper [5] from 2005 by G.B. DiMasi and L. Stettner the authors
consider the filtering process when the space $(S,\mathcal{S})$ is a complete, separable,
metrizable space. They also require that a positivity condition be satisfied when
proving that the distributions of the filtering process converge in distribution
to a unique limit measure. They use the so-called Hilbert norm to measure
distances between measures. One drawback of the Hilbert norm is that it
is equal to infinity if the supports of the two measures under consideration are
different, and it seems that it is for this reason that the authors of [8] need to make
certain uniform positivity assumptions.

Although a few papers have been published in the last decade, roughly speak- 
ing, the convergence problem for the distributions of filtering process (problem 
4 above) has not been considered very often in the literature, in particular if one 
compares with the number of papers dealing with the so called stability problem 
for the filtering process. One reason could be due to the fact that, if the space 
(S, S) on which the hidden process takes its values is not compact, then the set 
of probability measures on the set of probability measures on (S, S) is a rather 
complicated set. Already if $S$ is a denumerable set and we measure the distance
between two probability vectors on $S$ by the $l_1$-norm, then the set of
probability vectors on $S$ will be a nonlocally compact set, and therefore the Markov
chain associated to the filtering process will be a Markov chain on a nonlocally
compact space. Moreover, if both the set $S$ and the set $A$ are denumerable,
then in general the tr.pr.f of the filtering process will not satisfy a Doeblin
condition nor be $\psi$-irreducible. (See e.g. [§] or [5] for the definitions of the Doeblin
condition and $\psi$-irreducibility.) In fact if we consider two filtering processes
having different initial distributions then, in a generic situation, the supports
of the distributions of the two filtering processes will be non-overlapping, which
causes some technical complications.

We shall now show how the filtering processes described above are interre- 
lated with Markov chains generated by tr.pr.fs induced by partitions of tr.pr.ms, 
when the spaces S and A are denumerable. 

Thus, let now $S$ denote a denumerable space and let $P$ be a tr.pr.m on $S$.
Let $A$ be another denumerable space, let $R = \{(R)_{i,a} : i \in S, a \in A\}$ be a
tr.pr.m from $S$ to $A$ and let $p_0 \in \mathcal{P}(S)$. As described above (see also [21]), let

$$\{X_0, X_1, X_2, X_3, \ldots\} \text{ and } \{Y_1, Y_2, Y_3, \ldots\}$$

be the two stochastic processes taking values in $S$ and $A$ respectively, that are
generated by $p_0$, $P$ and $R$.

For $n = 1,2,\ldots$ and $i \in S$ let

$$Z_{n,i} = \Pr[X_n = i \mid Y_1, Y_2, \ldots, Y_n] \tag{3}$$

and set

$$Z_n = (Z_{n,i},\ i \in S). \tag{4}$$

Next let us for each $a \in A$ define an $S \times S$ matrix $M(a)$ by

$$(M(a))_{i,j} = (P)_{i,j}(R)_{j,a}, \quad i \in S,\ j \in S. \tag{5}$$

It is easily checked that

$$\sum_{a \in A} M(a) = P$$

and hence if we set $\mathcal{M}_0 = \{M(a) : a \in A\}$ then $\mathcal{M}_0$ is a partition of $P$.

The interesting fact about this partition is that the distribution of
the conditional state distribution $Z_n$ is given by $\mathbf{P}^n_{\mathcal{M}_0}(p_0,\cdot)$, that is

$$\Pr[Z_n \in B] = \mathbf{P}^n_{\mathcal{M}_0}(p_0,B), \quad B \in \mathcal{E},\ n = 1,2,\ldots,$$

a relation which in principle was proved already in pQ. (See also [26], [16]
and in particular [19].) Hence in order to prove convergence in distribution
of the distributions of the filtering process it suffices to prove convergence in
distribution of the Markov chain generated by the tr.pr.f $\mathbf{P}_{\mathcal{M}_0}(\cdot,\cdot)$ induced by
the partition $\mathcal{M}_0 = \{M(a) : a \in A\}$ of $P$, where thus a matrix $M(a)$ of $\mathcal{M}_0$ is
defined by (5).
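The construction of the partition from $P$ and $R$, and the resulting filter recursion $z \mapsto zM(a)/\|zM(a)\|$, can be sketched in Python for finite $S$ and $A$. This is only an illustration: the matrices, the observation record and the function names are hypothetical, and the sketch assumes the convention in which the observation at time $n$ is emitted from the state $X_n$, so that $(M(a))_{i,j} = (P)_{i,j}(R)_{j,a}$.

```python
def filter_partition(P, R):
    """M(a)_ij = P_ij * R_ja for each observation symbol a (assumed convention)."""
    n, m = len(P), len(R[0])
    return [[[P[i][j] * R[j][a] for j in range(n)] for i in range(n)]
            for a in range(m)]

def filter_update(z, M):
    """One step of the filter: z -> zM / ||zM||_1 (requires ||zM|| > 0)."""
    y = [sum(z[i] * M[i][j] for i in range(len(z))) for j in range(len(M[0]))]
    mass = sum(y)
    if mass <= 0:
        raise ValueError("observation has probability zero under z")
    return [t / mass for t in y]

# Hypothetical hidden chain with 2 states and 2 observation symbols.
P = [[0.9, 0.1], [0.2, 0.8]]
R = [[0.7, 0.3], [0.4, 0.6]]   # R[j][a] = Pr[Y = a | X = j]
partition = filter_partition(P, R)

# The matrices M(a) add up to P entrywise, so they form a partition of P.
recombined = [[sum(M[i][j] for M in partition) for j in range(2)] for i in range(2)]

# Filter along a hypothetical observation record a_1, a_2, a_3.
z = [0.5, 0.5]                 # initial distribution p_0
for a in [0, 1, 1]:
    z = filter_update(z, partition[a])
```

The vector `z` after the loop plays the role of $Z_3$ for that particular record; averaging over records weighted by their probabilities gives the distribution described above.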

The main application of the convergence result regarding Markov chains 
generated by tr.pr.fs induced by partitions of tr.pr.ms, proved in this paper, is 
thus to the filtering process $\{Z_n, n = 1,2,\ldots\}$ for denumerable spaces $S$ and $A$.
We also use the convergence result to generalize Blackwell's entropy formula for 
functions of Markov chains from finite state spaces to denumerable state spaces. 
1.3. The main theorem. Let $S$ be a denumerable set and let as above $K$
denote the set of probability vectors on $S$, that is

$$K = \{x = ((x)_i, i \in S) : \sum_{i \in S} (x)_i = 1,\ (x)_i \ge 0\}. \tag{6}$$

Let $M$ be an $S \times S$ matrix. We define the norm $\|M\|$ by

$$\|M\| = \sup\{\|xM\| : \|x\| = 1,\ x \in \mathbb{R}^S\}, \tag{7}$$

where thus $\|\cdot\|$ denotes the $l_1$-norm and $\mathbb{R}$ the real numbers.
Let $U$ denote the set of $S$-dimensional vectors specified by

$$U = \{u = (u_i, i \in S) : u_i \ge 0, \text{ and } \sup\{u_i : i \in S\} = 1\}$$

and let $\mathcal{W}$ denote the set of $S \times S$ matrices defined by

$$\mathcal{W} = \{W = u^c v : u \in U, v \in K\}$$

where $u^c$ denotes the transpose of $u$. Note that if $W \in \mathcal{W}$ then $\|W\| = 1$ since
$\sum_{j \in S} u_i v_j = u_i \le 1$ for all $i \in S$ and $\sup_i u_i = 1$. We call an element in $\mathcal{W}$ a
nonnegative rank 1 matrix of norm 1.
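For a finite state space the norm (7) is the largest $l_1$ row sum, which makes the remark $\|W\| = 1$ easy to check numerically. The following Python sketch is an illustration under that finiteness assumption; the vectors `u` and `v` are hypothetical.

```python
def operator_norm(M):
    """||M|| = sup{||xM||_1 : ||x||_1 = 1}; for row vectors x and a finite
    matrix this is the largest l1 row sum."""
    return max(sum(abs(t) for t in row) for row in M)

def rank_one(u, v):
    """W = u^c v, i.e. W_ij = u_i * v_j."""
    return [[ui * vj for vj in v] for ui in u]

u = [1.0, 0.5, 0.25]      # hypothetical vector in U: nonnegative, sup = 1
v = [0.2, 0.3, 0.5]       # hypothetical probability vector in K
W = rank_one(u, v)
norm_W = operator_norm(W)  # row i sums to u_i, so the maximum is sup_i u_i = 1
```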

Next, let $P$ be a tr.pr.m on $S$ and let $\mathcal{M} = \{M(w) : w \in W\}$ be a partition of
$P$. If $\{w_1, w_2, \ldots, w_m\}$ is a finite sequence of elements in $W$ we use the notations

$$\mathbf{w}^m = \{w_1, w_2, \ldots, w_m\},$$

and

$$M(\mathbf{w}^m) = M(w_1)M(w_2)\cdots M(w_m).$$

Our first condition is a rather straightforward generalization of a "rank 1
condition" introduced in [19] for the case when the state space $S$ is a finite set.
(See [19] page 1807.) Here and throughout this paper we let $e^i$, $i \in S$, denote
the vector in $K$ defined by

$$(e^i)_i = 1. \tag{8}$$

Condition B1. There exists a nonnegative rank 1 matrix $W = u^c v$ of
norm 1, a sequence of integers $\{n_1, n_2, \ldots\}$ and a sequence $\{\mathbf{w}_j^{n_j}, j = 1,2,\ldots\}$ of
sequences $\mathbf{w}_j^{n_j} = \{w_{1,j}, w_{2,j}, \ldots, w_{n_j,j}\} \in W^{n_j}$, such that $\|M(\mathbf{w}_j^{n_j})\| > 0$,
$j = 1,2,\ldots$, and such that for all $i \in S$

$$\lim_{j \to \infty} \left\| e^i M(\mathbf{w}_j^{n_j})/\|M(\mathbf{w}_j^{n_j})\| - e^i W \right\| = 0.$$

It is not difficult to prove that if the state space $S$ is finite and the partition
is determined by an "observation matrix" $R$ (see (5)), then Condition B1 is
equivalent to the "rank 1 condition" introduced in [19].

In order to define our next condition we first need to introduce the notion of
barycenter. The barycenter of a measure $\mu \in \mathcal{P}(K)$ is defined as that vector
$b(\mu) \in K$ whose $i$:th coordinate $(b(\mu))_i$ is defined by

$$(b(\mu))_i = \int_K (x)_i\,\mu(dx). \tag{9}$$

That the vector $b(\mu)$ belongs to $K$ follows from the fact that the set $K$ is
convex. We let $\mathcal{P}(K|q)$ denote the subset of $\mathcal{P}(K)$ such that each $\mu \in \mathcal{P}(K|q)$
has barycenter equal to $q$.

We are now ready to introduce the condition under which the main theorem
of this paper is proved. Let $S$ be a denumerable set, let $P$ be a tr.pr.m on $S$ such
that $P$ is irreducible, aperiodic and positively recurrent, let $\pi$ denote the unique
probability vector in $K$ such that $\pi P = \pi$ and let $\mathcal{M} = \{M(w) : w \in W\}$ be
a partition of $P$. Since $P$ is irreducible and positively recurrent it follows that

$$(\pi)_i > 0, \quad \forall i \in S.$$

Condition B. For every $\rho > 0$ there exists an element $i_0 \in S$ such that if
$C \subset K$ is a compact set satisfying

$$\mu(C \cap \{x : (x)_{i_0} \ge (\pi)_{i_0}/2\}) \ge (\pi)_{i_0}/3, \quad \forall \mu \in \mathcal{P}(K|\pi), \tag{10}$$

then we can find an integer $N$, and a sequence $\{w_1, w_2, \ldots, w_N\}$ of elements in
$W$, such that, if we set

$$M(\mathbf{w}^N) = M(w_1)M(w_2)\cdots M(w_N),$$

then

$$\|e^{i_0} M(\mathbf{w}^N)\| > 0$$

and if $x \in C \cap \{x : (x)_{i_0} \ge (\pi)_{i_0}/2\}$ then also

$$\left\| xM(\mathbf{w}^N)/\|xM(\mathbf{w}^N)\| - e^{i_0}M(\mathbf{w}^N)/\|e^{i_0}M(\mathbf{w}^N)\| \right\| < \rho.$$

That there exists a compact set $C$ such that (10) holds is proved in section 4.

It is not very difficult to prove that Condition B1 implies Condition B when
$P$ is irreducible, aperiodic and positively recurrent, a fact we shall prove in
section 9.

The main theorem of this paper reads as follows: 

Theorem 1.1 Let $S$ be a denumerable set, let $P \in PM(S \times S)$ be irreducible,
aperiodic and positively recurrent, let $\pi \in K$ satisfy $\pi P = \pi$ and let $\mathcal{M}$ be a
partition of $P$. Suppose also that Condition B holds. Then $\mathbf{P}_{\mathcal{M}}$ is asymptotically
stable.

1.5. Some remarks about the proof of the main theorem. In this 
subsection we first introduce some further notations and concepts and state a 
few obvious but important properties for partitions. 

As usual let $S$ denote a denumerable set, let $K$ denote the set of probability
vectors on $S$ and let $\mathcal{E}$ be the Borel field induced by the $l_1$-metric on $K$.
Let $F[K]$ denote the set of real, bounded functions on $K$, let $B[K]$ denote the
set of real, bounded, $\mathcal{E}$-measurable functions on $K$, for $u \in C[K]$ define
$\gamma(u) = \sup\{|u(x) - u(y)|/\|x - y\| : x,y \in K, x \ne y\}$, let $Lip[K] = \{u \in C[K] :
\gamma(u) < \infty\}$ and $Lip_1[K] = \{u \in Lip[K] : \gamma(u) \le 1\}$.

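Now let $P \in PM(S \times S)$, let $\mathcal{M} = \{M(w) : w \in W\}$ be a partition
of $P$ and let $\mathbf{P}_{\mathcal{M}}$ denote the tr.pr.f on $(K,\mathcal{E})$ induced by $\mathcal{M}$. The mapping
$T_{\mathcal{M}} : F[K] \to F[K]$ defined by

$$T_{\mathcal{M}}u(x) = \sum_{w \in W_{\mathcal{M}}(x)} u(xM(w)/\|xM(w)\|)\,\|xM(w)\|, \quad x \in K,$$

where as above $W_{\mathcal{M}}(x) = \{w \in W : \|xM(w)\| > 0\}$ (see (2)), will be called
the transition operator induced by $\mathcal{M}$. For $u \in F[K]$ it is clear from the
definition of $\mathbf{P}_{\mathcal{M}}(\cdot,\cdot)$ (see (1)) that

$$T_{\mathcal{M}}u(x) = \sum_{w \in W_{\mathcal{M}}(x)} u(xM(w)/\|xM(w)\|)\,\|xM(w)\| = \int_K u(y)\,\mathbf{P}_{\mathcal{M}}(x,dy).$$

The mapping $P_{\mathcal{M}} : \mathcal{P}(K) \to \mathcal{P}(K)$ defined by

$$P_{\mathcal{M}}\mu(B) = \int_K \mathbf{P}_{\mathcal{M}}(x,B)\,\mu(dx), \quad B \in \mathcal{E},$$

will be called the transition probability operator induced by $\mathcal{M}$.
For $u \in B[K]$ and $\mu \in \mathcal{P}(K)$ we shall - when convenient - write

$$\langle u, \mu \rangle = \int_K u(x)\,\mu(dx).$$

It is well-known that

$$\langle T_{\mathcal{M}}u, \mu \rangle = \langle u, P_{\mathcal{M}}\mu \rangle. \tag{11}$$

(See e.g. [IS], chapter 1, section 1.)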

A crucial property for partitions of matrices is the following. Let $P_1$ and $P_2$
be two tr.pr.ms in $PM(S \times S)$, let $\mathcal{M}_1 = \{M(w_1) : w_1 \in W_1\}$ be a partition of
$P_1$ and let $\mathcal{M}_2 = \{M(w_2) : w_2 \in W_2\}$ be a partition of $P_2$. Define the set

$$\mathcal{M}_3 = \{M(w_1)M(w_2) : w_1 \in W_1, w_2 \in W_2\}. \tag{12}$$

Then $\mathcal{M}_3$ is a partition of the matrix $P_1P_2$. The proof is elementary. We call $\mathcal{M}_3$
the product of $\mathcal{M}_1$ and $\mathcal{M}_2$ and we write $\mathcal{M}_3 = \mathcal{M}_1 \cdot \mathcal{M}_2$. It is also elementary
to show that if we have three partitions $\mathcal{M}_1$, $\mathcal{M}_2$ and $\mathcal{M}_3$ with matrices of the
same format then

$$(\mathcal{M}_1 \cdot \mathcal{M}_2) \cdot \mathcal{M}_3 = \mathcal{M}_1 \cdot (\mathcal{M}_2 \cdot \mathcal{M}_3). \tag{13}$$

Therefore, if $\mathcal{M} = \{M(w) : w \in W\}$ is a partition of $P$, then $\mathcal{M}^n$ is well-defined
and is a partition of $P^n$.

Another obvious relation is

$$\sum_{w \in W} \|xM(w)\| = 1, \quad x \in K, \tag{14}$$

if $\{M(w) : w \in W\}$ is a partition of a tr.pr.m $P$, since $\sum_{w \in W} \|xM(w)\| =
\|xP\| = 1$ if $x \in K$.
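Both the product property (12) and the relation (14) are easy to check numerically for small matrices. The Python sketch below does this for two hypothetical partitions of $2 \times 2$ transition probability matrices; the helper names and the numbers are illustrative and not taken from the paper.

```python
def mat_mul(A, B):
    n, m, k = len(A), len(B[0]), len(B)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def mat_sum(mats):
    n, m = len(mats[0]), len(mats[0][0])
    return [[sum(M[i][j] for M in mats) for j in range(m)] for i in range(n)]

def product_partition(part1, part2):
    """All products M(w1) M(w2), as in (12)."""
    return [mat_mul(A, B) for A in part1 for B in part2]

def total_mass(x, partition):
    """sum_w ||x M(w)||_1, the left-hand side of (14)."""
    total = 0.0
    for M in partition:
        y = [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]
        total += sum(abs(t) for t in y)
    return total

# Hypothetical partitions of two 2x2 transition probability matrices.
part1 = [[[0.3, 0.1], [0.05, 0.5]], [[0.2, 0.4], [0.2, 0.25]]]
part2 = [[[0.6, 0.0], [0.1, 0.0]], [[0.0, 0.4], [0.0, 0.9]]]
P1, P2 = mat_sum(part1), mat_sum(part2)

part3 = product_partition(part1, part2)
lhs = mat_sum(part3)          # sum of all products M(w1) M(w2)
rhs = mat_mul(P1, P2)         # the matrix P1 P2
mass = total_mass([0.4, 0.6], part3)
```

Here `lhs` and `rhs` agree up to rounding, and `mass` equals one up to rounding, as (14) predicts for the product partition.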

Now some remarks about the proof of Theorem 1.1. When proving asymptotic
stability for a tr.pr.f $Q$ on some general metric space $(\mathcal{K},\mathcal{A})$ one strategy
is to first verify that there exists a point $x^* \in \mathcal{K}$ such that the sequence
$\{Q^n(x^*,\cdot), n = 1,2,\ldots\}$ of probability measures is a tight sequence, which
together with Feller continuity implies that there exists an invariant measure for
$Q$, that is a measure $\nu$ such that

$$\int_{\mathcal{K}} Q(x,B)\,\nu(dx) = \nu(B), \quad \forall B \in \mathcal{A}.$$

Then, by assuming some kind of contact condition, recurrence condition or
contraction condition, one proves that if the function $u$ belongs to a sufficiently
large set of functions, then

$$\lim_{n \to \infty} |T^n u(x) - T^n u(y)| = 0, \quad x, y \in \mathcal{K}, \tag{15}$$

where thus $T$ denotes the transition operator associated to the tr.pr.f $Q$. Finally
one uses this limit result to prove both 1) that $Q$ has only one invariant measure
and 2) that $Q$ is asymptotically stable.

This is precisely the strategy that we shall use in this paper. In order to prove
that there exists a point $x^* \in K$ such that the sequence $\{\mathbf{P}^n_{\mathcal{M}}(x^*,\cdot), n = 1,2,\ldots\}$
of probability measures is a tight sequence we shall need two important facts.
The first fact is that for each $q \in K$ the set $\mathcal{P}(K|q)$ of probability measures with
equal barycenter is a tight set of probability measures; this is proved in section
4. The second fact we need is that, if $\pi \in K$ satisfies $\pi P = \pi$ and $\mu \in \mathcal{P}(K|\pi)$,
then it follows that $P_{\mathcal{M}}\mu \in \mathcal{P}(K|\pi)$ for any partition $\mathcal{M}$ of $P$; this is proved
in section 5. From these two facts it follows immediately that the sequence
$\{\mathbf{P}^n_{\mathcal{M}}(\pi,\cdot), n = 1,2,\ldots\}$ is a tight sequence if $\pi P = \pi$.

To prove that (15) is satisfied we first of all use a universal inequality for the
transition operator $T_{\mathcal{M}}$ induced by a partition $\mathcal{M}$, which reads as follows:

$$|T_{\mathcal{M}}u(x) - T_{\mathcal{M}}u(y)| \le 3\gamma(u)\|x - y\|, \quad x,y \in K,\ u \in Lip[K]. \tag{16}$$

In principle this inequality was already proved in [16], but the universal
character of this inequality was not observed in [16].
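A numerical spot check of inequality (16) can be sketched as follows. This is only an illustration on a hypothetical two-state partition with a hypothetical test function whose Lipschitz constant in the $l_1$ metric is at most 1; it does not replace the proof.

```python
import random

def transition_operator(u, x, partition):
    """T_M u(x) = sum over w with ||xM(w)|| > 0 of u(xM(w)/||xM(w)||) * ||xM(w)||."""
    total = 0.0
    for M in partition:
        y = [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]
        mass = sum(y)
        if mass > 0:
            total += u([t / mass for t in y]) * mass
    return total

def random_prob_vector(n, rng):
    raw = [rng.random() for _ in range(n)]
    s = sum(raw)
    return [t / s for t in raw]

def u(x):
    """Test function with Lipschitz constant at most 1 in the l1 metric."""
    return abs(x[0] - 0.3)

partition = [[[0.3, 0.1], [0.05, 0.5]], [[0.2, 0.4], [0.2, 0.25]]]  # hypothetical
rng = random.Random(1)
worst_ratio = 0.0
for _ in range(1000):
    x, y = random_prob_vector(2, rng), random_prob_vector(2, rng)
    dist = sum(abs(a - b) for a, b in zip(x, y))
    if dist > 1e-9:
        gap = abs(transition_operator(u, x, partition)
                  - transition_operator(u, y, partition))
        worst_ratio = max(worst_ratio, gap / dist)
```

The largest observed ratio `worst_ratio` stays below 3, consistent with (16) for a function with $\gamma(u) \le 1$.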

From (16) it follows easily that the sequence $\{T^n_{\mathcal{M}}u, n = 1,2,\ldots\}$ is an
equicontinuous sequence when $u \in Lip[K]$. Now if the space $(K,\mathcal{E})$ had been
compact then, as is well-known and easily proved, in order to prove (15) it
would be enough to verify that to every $\rho > 0$ there exists an integer $N$ and a number
$\alpha > 0$ such that for any two initial points $x$ and $y$ in $K$

$$\Pr[\delta(X_N(x), X_N(y)) < \rho] \ge \alpha \tag{17}$$

where thus $\{X_n(x), n = 0,1,2,\ldots\}$ and $\{X_n(y), n = 0,1,2,\ldots\}$ are two independent
Markov chains generated by $\mathbf{P}_{\mathcal{M}}(\cdot,\cdot)$, the former starting at $x$, the latter at
$y$.

Since, in our case, the space $(K,\mathcal{E})$ is a non-compact space if the set $S$ is
denumerable and infinite, one cannot in general expect to find an
integer $N$ and a number $\alpha > 0$ such that (17) holds for all $x,y \in K$. For this
reason we have introduced a notion we call the shrinking property, which is
defined as follows:

The shrinking property. Let $Q$ be a tr.pr.f on a metric space $(\mathcal{K},\mathcal{A})$
where $\mathcal{A}$ is the Borel field generated by a metric. Let $Lip[\mathcal{K}]$ denote the set of
Lipschitz functions on $(\mathcal{K},\mathcal{A})$, and let $T$ be the associated transition operator.
If for every $\rho > 0$ there exists a number $\alpha$, $0 < \alpha < 1$, such that for every
nonempty, compact set $A \subset \mathcal{K}$, every $\eta > 0$ and every $\kappa > 0$, we can find an
integer $N$ and another nonempty, compact set $B \subset \mathcal{K}$, such that, if the integer
$n > N$, then for all $u \in Lip[\mathcal{K}]$

$$\sup_{x,y \in A} |T^n u(x) - T^n u(y)| \le \eta\gamma(u) + \kappa\,\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1-\alpha) \sup_{z_1,z_2 \in B} |T^{n-N}u(z_1) - T^{n-N}u(z_2)|$$

then we say that $Q$ has the shrinking property. We call $\alpha$ a shrinking number
associated to $\rho$.

Note that a shrinking number $\alpha$ only depends on which $\rho$ is chosen, whereas the
integer $N$ and the new nonempty compact set $B$ depend on the initial choice
of the nonempty compact set $A$ and depend also on how small we have chosen
the numbers $\eta$ and $\kappa$; roughly speaking, a large compact set $A$, a small $\eta$ and a
small $\kappa$ require a large integer $N$ and a large compact set $B$.

Now by showing that the tr.pr.f $\mathbf{P}_{\mathcal{M}}$ has the shrinking property, when
Condition B is satisfied, and using an auxiliary theorem for Markov chains having
the shrinking property proved in section 6, we are able to prove (15) and then
it is a simple matter to prove asymptotic stability, and hence Theorem 1.1.

1.6. Exceptional cases. One consequence of asymptotic stability is that
there exists only one invariant measure. Therefore if $\mathbf{P}_{\mathcal{M}}$ fulfills the hypotheses
of Theorem 1.1 then the equation

$$\int_K \mathbf{P}_{\mathcal{M}}(x,B)\,\mu(dx) = \mu(B), \quad \forall B \in \mathcal{E} \tag{18}$$

has a unique solution in $\mathcal{P}(K)$. In the paper [3] Blackwell conjectured that the
equation (18) has a unique solution if $S$ is finite, $P \in PM(S \times S)$ is
indecomposable and the partition is determined by a lumping function on $S$. However,
there are counterexamples to this conjecture and one such counterexample was
presented in [16]. In fact, already in 1974, H. Kesten constructed an example,
not published before, which shows that the tr.pr.f $\mathbf{P}_{\mathcal{M}}$ can in fact even be
periodic ([18]). In section 8 we present this counterexample.

In section 8 we also state and prove a theorem with hypotheses that
guarantee that $\mathbf{P}_{\mathcal{M}}$ is not asymptotically stable. To state the theorem we need two
further notations, $K(x,\mathcal{M})$ and $K_{S'}$. Let $\mathcal{M} = \{M(w) : w \in W\}$ be an
arbitrary partition of a tr.pr.m. Recall that $W_{\mathcal{M}}(x)$ is defined by (2). For each
$x \in K$ we define

$$K(x,\mathcal{M}) = \{y \in K : y = xM(w)/\|xM(w)\| \text{ for some } w \in W_{\mathcal{M}}(x)\}.$$

Let $S$ be a denumerable set and $S' \subset S$. We define

$$K_{S'} = \{x \in K : (x)_i = 0,\ i \notin S'\}.$$



Theorem 1.2 Let $S$ be a denumerable set, let $P \in PM(S \times S)$ and let $\mathcal{M} =
\{M(w) : w \in W\}$ be a partition of $P$. Now suppose that there exists a subset
$S' \subset S$ consisting of at least two elements, such that

1) for every $x \in K_{S'}$ the set $\bigcup_{n=1}^{\infty} K(x,\mathcal{M}^n)$ consists of isolated points,

2) if both $x$ and $y$ are in $K_{S'}$ then $W_{\mathcal{M}^n}(x) = W_{\mathcal{M}^n}(y)$, $n = 1,2,\ldots$,
and

3) if $x$ and $y$ are in $K_{S'}$, $n \ge 1$ and $\mathbf{w}^n \in W_{\mathcal{M}^n}(x)$ then

$$\left\| xM(\mathbf{w}^n)/\|xM(\mathbf{w}^n)\| - yM(\mathbf{w}^n)/\|yM(\mathbf{w}^n)\| \right\| = \|x - y\|.$$

If these conditions are fulfilled then $\mathbf{P}_{\mathcal{M}}$ is not asymptotically stable.






Remark. If we could prove that the hypotheses of Theorem 1.2 are also
necessary in order for $\mathbf{P}_{\mathcal{M}}$ to be a tr.pr.f which is not asymptotically stable when
$P \in PM(S \times S)$ is irreducible, aperiodic and positively recurrent, we would be
able to formulate an easily checked criterion for deciding whether a tr.pr.f $\mathbf{P}_{\mathcal{M}}$
induced by a partition $\mathcal{M}$ of a tr.pr.m $P$ is asymptotically stable or not. □

Conjecture 1.1 Let $S$ be a denumerable set, let $P \in PM(S \times S)$ be irreducible,
aperiodic and positively recurrent, let $\mathcal{M} = \{M(w) : w \in W\}$ be a partition
of $P$ and let $\mathbf{P}_{\mathcal{M}}$ be the tr.pr.f induced by $\mathcal{M}$. Then either the hypotheses of
Theorem 1.1 or the hypotheses of Theorem 1.2 are satisfied.

In section 8 we also describe a class of tr.pr.ms and partitions for which the
hypotheses of Theorem 1.2 are fulfilled, and show that both Kesten's example
and the counterexample in [16] belong to this class.

1.7. Blackwell's entropy formula. Let $S$ be a denumerable set, let $P \in
PM(S \times S)$ be irreducible, aperiodic and positively recurrent, let $\{X_0, X_1, X_2, \ldots\}$
denote a Markov chain with tr.pr.m $P$ and initial distribution $\pi$ where $\pi$ satisfies
$\pi P = \pi$, let $g : S \to A$ be a "lumping" function on $S$, let $\mathcal{M} = \{M(a) : a \in A\}$
be the partition of $P$ defined by

$$(M(a))_{i,j} = (P)_{i,j} \quad \text{if } g(j) = a,$$

$$(M(a))_{i,j} = 0 \quad \text{if } g(j) \ne a,$$

and define $Y_n$, $n = 0,1,2,\ldots$ by $Y_n = g(X_n)$, $n = 0,1,2,\ldots$. In the paper [3]
it was assumed that $S$ is finite, and it was shown that the entropy rate of the
$\{Y_n\}$-process is given by

$$\sum_{a \in A} \int_K h(\|yM(a)\|)\,\mu(dy)$$

where

$$h(t) = -(1/\ln 2)\,t\ln(t), \text{ if } 0 < t \le 1, \text{ and } h(0) = 0,$$

and where $\mu$ is an invariant measure associated to the tr.pr.f $\mathbf{P}_{\mathcal{M}}$ induced by
the partition $\mathcal{M}$. In section 13 we generalize the entropy result obtained by
Blackwell to Markov chains on denumerable state spaces. By using convexity
properties proved in section 11 we can also give lower and upper bounds for the
entropy.
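To make the formula concrete, the following Python sketch evaluates the sum of integrals for a finitely supported measure. The chain, the lumping function and the measure are hypothetical: fair coin flips observed without lumping, for which the measure putting weight 1/2 on each unit vector is invariant and the entropy rate is one bit.

```python
import math

def h(t):
    """h(t) = -t * log2(t) for 0 < t <= 1, with h(0) = 0."""
    return 0.0 if t == 0 else -t * math.log2(t)

def lumping_partition(P, g, symbols):
    """M(a)_ij = P_ij if g(j) == a, and 0 otherwise."""
    n = len(P)
    return [[[P[i][j] if g(j) == a else 0.0 for j in range(n)] for i in range(n)]
            for a in symbols]

def entropy_integral(atoms, partition):
    """sum_a integral h(||yM(a)||) mu(dy) for a finitely supported measure mu.

    atoms is a list of (weight, y) pairs describing mu."""
    total = 0.0
    for weight, y in atoms:
        for M in partition:
            mass = sum(sum(y[i] * M[i][j] for i in range(len(y)))
                       for j in range(len(M[0])))
            total += weight * h(mass)
    return total

# Hypothetical check: fair coin flips, g is the identity on {0, 1}.
P = [[0.5, 0.5], [0.5, 0.5]]
partition = lumping_partition(P, lambda j: j, [0, 1])
# For this chain the measure with weight 1/2 on each unit vector is invariant.
mu = [(0.5, [1.0, 0.0]), (0.5, [0.0, 1.0])]
rate = entropy_integral(mu, partition)   # one bit per step for this example
```

In general the invariant measure is not finitely supported, so a sketch like this can only approximate the integral.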

1.8. The plan of the paper. In section 2 we prove the rather obvious
fact that, if $S$ is a denumerable set, $P_1$ and $P_2$ belong to $PM(S \times S)$, $\mathcal{M}_1$ is a
partition of $P_1$ and $\mathcal{M}_2$ is a partition of $P_2$, then

$$P_{\mathcal{M}_1\mathcal{M}_2} = P_{\mathcal{M}_2}P_{\mathcal{M}_1}$$

from which follows that for any partition $\mathcal{M}$

$$P^n_{\mathcal{M}} = P_{\mathcal{M}^n}, \quad n = 1,2,\ldots,$$

which is a very useful relation.
In section 3 we first prove the universal inequality (16). From this inequality
it readily follows that for any partition $\mathcal{M}$

$$\gamma(T^n_{\mathcal{M}}u) \le 3\gamma(u), \quad n = 1,2,\ldots,\ \forall u \in Lip[K]. \tag{19}$$

We also introduce the well-known Kantorovich metric for probability measures
in $\mathcal{P}(K)$, and we present an example which shows that the number 3 on the
right hand side of the inequality (19) cannot be decreased to a number less
than 2.

In section 4 we consider probability measures with equal barycenters. We
prove that for any $q \in K$ the set $\mathcal{P}(K|q)$ is a tight set in $\mathcal{P}(K)$ and we also
prove that the Kantorovich distance between a measure $\mu \in \mathcal{P}(K|p)$ and the set
$\mathcal{P}(K|q)$ is equal to $\|q - p\|$.

In section 5, we prove that if $P \in PM(S \times S)$ is irreducible, aperiodic and
positively recurrent, and $\pi P = \pi$, then it follows that if $\mu$ belongs to $\mathcal{P}(K|\pi)$
then $P_{\mathcal{M}}\mu$ also belongs to $\mathcal{P}(K|\pi)$. In section 5 we also prove that the barycenter
of $\mathbf{P}^n_{\mathcal{M}}(x,\cdot)$ is equal to $xP^n$.
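For $n = 1$ the barycenter statement follows in one line from the definition of the tr.pr.f induced by a partition. The short derivation below is included only as a sketch and is not the argument of section 5; it writes $b(\cdot)$ for the barycenter, $\mathbf{P}_{\mathcal{M}}$ for the tr.pr.f and $W_{\mathcal{M}}(x)$ for the set of indices $w$ with $\|xM(w)\| > 0$.

```latex
b\bigl(\mathbf{P}_{\mathcal{M}}(x,\cdot)\bigr)
  = \sum_{w \in W_{\mathcal{M}}(x)} \|xM(w)\| \, \frac{xM(w)}{\|xM(w)\|}
  = \sum_{w \in W} xM(w)
  = x \sum_{w \in W} M(w)
  = xP .
```

The second equality uses that $xM(w)$ is the zero vector whenever $\|xM(w)\| = 0$, so the omitted terms contribute nothing.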

In section 6 we prove an auxiliary theorem for a Markov chain on a complete, 
separable, metric space for which its tr.pr.f has the shrinking property. In section 
7 we use this auxiliary theorem to prove Theorem 1.1, and in section 8 we give 
some examples for which asymptotic stability does not hold and prove Theorem 
1.2. 

In section 9 we focus on Condition B and first prove that Condition B1
implies Condition B. Furthermore we prove that, if the partition $\mathcal{M}$ is such
that 1) sooner or later the Markov chain of probability vectors generated by
$\mathbf{P}_{\mathcal{M}}$ at some moment will have a finite support with positive probability, and 2)
a condition similar to Condition A introduced in [16] is satisfied, then it follows
that Condition B is satisfied.

In section 10, since we have not been able to find a simple criterion from
which Condition B follows, we present a concrete random walk example on the
integers for which Condition B is satisfied. The partition in this example consists
of just two matrices, $M(1)$ and $M(2)$, such that $(M(1))_{i,j} = (P)_{i,j}$ if $j$ is odd
and zero otherwise, where $P$ denotes the tr.pr.m governing the random walk.

In section 11 we consider convex functions on the set $K$. Let $C_{convex}[K]$
denote the set of all continuous, bounded, convex functions on $K$. If $u \in
C_{convex}[K]$ is such that it can be obtained as

$$u = \sup\{v_n : n \in \mathcal{N}\}$$

where $\mathcal{N}$ is an arbitrary index set and each $v_n$ is an affine function on $K$ such
that

$$v_n(x) = x a_n^c + b_n$$

where $a_n \in l^\infty(S)$, $y^c$ denotes the transpose of a vector $y$ and $b_n$ is a real
number, then we say that $u \in C'_{convex}[K]$. (Here $l^\infty(S) = \{a = ((a)_i, i \in S) :
\sup_{i \in S} |(a)_i| < \infty\}$.)

In section 11 we prove that a transition operator $T_{\mathcal{M}}$, induced by a partition
$\mathcal{M}$, maps $C'_{convex}[K]$ into $C_{convex}[K]$. We also show that if $\mathcal{M}$ is a partition of
a tr.pr.m $P$ and the vector $\pi$ in $K$ satisfies $\pi P = \pi$ then, if $u \in C'_{convex}[K]$, we
have the following string of inequalities for $n = 1,2,\ldots$:

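$$\langle u, P^n_{\mathcal{M}}\delta_\pi \rangle \le \langle u, P^{n+1}_{\mathcal{M}}\delta_\pi \rangle \le \langle u, P^{n+1}_{\mathcal{M}}\varphi_\pi \rangle \le \langle u, P^n_{\mathcal{M}}\varphi_\pi \rangle. \tag{20}$$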
Here and throughout this paper, for each $x \in K$ we let $\delta_x$ denote the measure
in $\mathcal{P}(K)$ which is defined by

$$\delta_x(\{x\}) = 1,$$

and we let $\varphi_x$ be the measure in $\mathcal{P}(K)$ which is defined by

$$\varphi_x(\{e^i\}) = (x)_i, \quad i \in S.$$

The inequalities in (20) are reminiscent of results obtained by Kunita in [20].

In section 12 we introduce a martingale by reversing the order of the matrix 
multiplication. 

In section 13, as mentioned above, we generalize Blackwell's formula for the 
entropy rate of functions of Markov chains. We also use the inequalities (|20p to 
obtain upper and lower bounds of the entropy rate. 

In section 14, finally, we give a proof of the rather intuitive fact that $P_{\mathcal{M}}$, as
defined in subsection 1.1, is in fact a transition probability function.

2 Further notations and some multiplication properties.

Let $S$ be a denumerable set, let $W$ denote another denumerable set, and let
$\mathcal{M} = \{M(w) : w \in W\}$ denote a denumerable set of nonnegative $S \times S$ matrices
such that

$$\sum_{j \in S} \sum_{w \in W} (M(w))_{i,j} = 1, \quad \forall i \in S. \qquad (21)$$

We let $\mathcal{G}(S)$ denote the family of denumerable sets of nonnegative $S \times S$ matrices
for which each set $\mathcal{M} = \{M(w), w \in W\} \in \mathcal{G}(S)$ is such that (21) holds.

For each $\mathcal{M} \in \mathcal{G}(S)$ we can - of course - associate a matrix $P \in PM(S \times S)$
defined by

$$P = \sum_{w \in W} M(w)$$

and by definition it follows that $\mathcal{M}$ is a partition of $P$. Therefore we call an
element $\mathcal{M} \in \mathcal{G}(S)$ a partition.

Throughout the rest of the paper if M. G S(S') then P denotes the tr.pr.m 
associated to M, and Pa^, Pm and Pm w hl denote the transition probability 
function, the transition operator and the transition probability operator induced 
by $\mathcal{M}$ as defined in the introduction. Also, if a denumerable set $S$ is given, the
capital letter $K$ always refers to the set of probability vectors on $S$.

A vector $x \in K$ will be called positive if $(x)_i > 0$, $\forall i \in S$, and we set
$K^+ = \{x \in K : x \text{ positive}\}$. If $\pi \in K^+$, we let $\mathcal{G}_\pi(S)$ denote the subset of
$\mathcal{G}(S)$ consisting of those partitions $\mathcal{M}$ for which the associated tr.pr.m $P$ is
irreducible, aperiodic and positively recurrent satisfying $\pi P = \pi$.

From the multiplication properties of partitions it is clear that G(S) is a 
semigroup, and since the set consisting of just the unit matrix belongs to G(S), 
it is a semigroup with unit. Clearly $\mathcal{G}_\pi(S)$ is also a semigroup for each $\pi \in K^+$.

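We also define $\mathcal{G}'(S) = \bigcup_{\pi \in K^+} \mathcal{G}_\pi(S)$, and let $\mathcal{G}^{\prime,Z}(S)$ denote the subset of
$\mathcal{G}'(S)$ consisting of those partitions $\mathcal{M} = \{M(w) : w \in W\}$ for which

$$\sup_{x \in K} \sum_{w \in W_{\mathcal{M}}(x)} -(\ln\|xM(w)\|) \cdot \|xM(w)\| < \infty. \qquad (22)$$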
Next, let $n \ge 2$ and consider the product $\mathcal{M}_1\mathcal{M}_2\cdots\mathcal{M}_n$ where $\mathcal{M}_i = \{M_i(w_i) : w_i \in W_i\}$, $i = 1, 2, \ldots, n$, belongs to $\mathcal{G}(S)$. We denote the product by $\mathcal{M}^n$, we
set $W^n = \prod_{i=1}^n W_i$ and as before we let $w^n$ denote an element in $W^n$ and we
let $M(w^n)$ denote an element in $\mathcal{M}^n$; thus $\mathcal{M}^n = \{M(w^n) : w^n \in W^n\}$. For
$x \in K$ we let

$$W_{\mathcal{M}^n}(x) = \{w^n \in W^n : \|xM(w^n)\| > 0\}.$$

(Compare ©.) 

The following trivial scaling property for matrix products will be used fre- 
quently. We omit the proof. 

Lemma 2.1 Let $A$ and $B$ denote two matrices (not necessarily of finite dimension) and assume that $AB$ is well defined. Let $x$ be a row vector and assume
also that

1) $xA$ is well defined, 2) $0 < \|xA\| < \infty$ and 3) $0 < \|xAB\| < \infty$. Then

$$xAB/\|xAB\| = (xA/\|xA\|)B/\|(xA/\|xA\|)B\|. \qquad (23)$$

As above, let $\mathbb{R}$ denote the real numbers, and let $\mathbb{R}^S$ denote the set of $S$-dimensional vectors with real coordinates; for $x \in \mathbb{R}^S$ define $\|x\| = \sum_{i \in S} |(x)_i|$
and let $l_1(S)$ denote the set $\{x \in \mathbb{R}^S : \|x\| < \infty\}$. Let $0$ denote the zero vector
in $\mathbb{R}^S$ and let $l_1'(S) = l_1(S) \setminus \{0\}$. Let $\overline{K} = \{x \in l_1(S) : \|x\| = 1\}$. We now
denote the projection map from $l_1'(S)$ to $\overline{K}$ by $[\cdot]$. Thus if $y \in l_1'(S)$ then

$$[y] = y/\|y\|. \qquad (24)$$

With this notation, when $x \in \mathbb{R}^S \setminus \{0\}$ we can write the formula (23) as

[xAB] = [[xA]B]. (25) 
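
The scaling property (25) can be checked numerically on a small example. The following is only an illustrative sketch: the helper names `l1`, `vecmat` and `proj` and the particular matrices are assumptions made for the example, and exact rational arithmetic is used so that both sides agree exactly.

```python
from fractions import Fraction as F

def l1(v):
    """l1 norm of a row vector."""
    return sum(abs(t) for t in v)

def vecmat(x, A):
    """Row vector times matrix."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def proj(y):
    """The map [y] = y / ||y|| for a nonzero vector y."""
    n = l1(y)
    return [t / n for t in y]

# Illustrative data (hypothetical): a probability vector and two nonnegative matrices.
x = [F(1, 2), F(1, 3), F(1, 6)]
A = [[F(1, 2), 0, 0], [F(1, 4), F(1, 4), 0], [0, 0, F(1, 3)]]
B = [[0, F(1, 2), 0], [F(1, 5), 0, F(1, 5)], [F(1, 2), 0, F(1, 2)]]

lhs = proj(vecmat(vecmat(x, A), B))        # [xAB]
rhs = proj(vecmat(proj(vecmat(x, A)), B))  # [[xA]B]
```

Normalizing after each factor or only once at the end gives the same point, which is the property used repeatedly when products of partitions are considered.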

Next some further notations. Let $x \in K$, let $\mathcal{M}_i \in \mathcal{G}(S)$, $i = 1, 2, \ldots$
and, for $n = 1, 2, \ldots$, let $\mathcal{M}^n = \{M(w^n) : w^n \in W^n\}$. Let $N$ be an integer
$\ge 1$, let $1 \le n \le N$, and let $x \in K$. If $(w_1, w_2, \ldots, w_N) \in W^N$ is such that
$w^n \in W_{\mathcal{M}^n}(x)$, $n = 1, 2, \ldots, N$, we let $x_n(w^n)$, for $n = 1, 2, \ldots, N$, be defined by

$$x_n(w^n) = [xM(w^n)] = xM(w^n)/\|xM(w^n)\|.$$

For $n = 1$ we write $x_1(w_1) = x_1(w^1)$. By using the scaling property it is easy
to conclude that

$$W_{\mathcal{M}^2}(x) = \{(w_1, w_2) \in W^2 : w_1 \in W_{\mathcal{M}_1}(x) \text{ and } w_2 \in W_{\mathcal{M}_2}(x_1(w_1))\}. \qquad (26)$$

Next we define the function $g : [0, 1] \to \mathbb{R}$ by

$$g(t) = -(1/\ln(2))\ln(t), \text{ if } t > 0, \text{ and } g(0) = 0.$$

We let $h : [0, 1] \to [0, 1/(e\ln(2))]$ be defined by

$$h(t) = g(t) \cdot t.$$

It is well-known and easily proved that $h : [0, 1] \to [0, 1/(e\ln(2))]$ is a continuous
and concave function on $[0, 1]$. For $\mathcal{M} = \{M(w) : w \in W\} \in \mathcal{G}^{\prime,Z}(S)$ we set

$$H_{\mathcal{M}}(x) = \sum_{w \in W_{\mathcal{M}}(x)} g(\|xM(w)\|)\|xM(w)\| = \sum_{w \in W} h(\|xM(w)\|). \qquad (27)$$
Before we state our next theorem let us note that if $\mathcal{M}_1 = \{M_1(w_1) : w_1 \in W_1\} \in \mathcal{G}(S)$ and $\mathcal{M}_2 = \{M_2(w_2) : w_2 \in W_2\} \in \mathcal{G}(S)$ then

$$\sum_{w_2 \in W_2} \|xM_1(w_1)M_2(w_2)\| = \|xM_1(w_1)\| \qquad (28)$$

because of (21) and the scaling property (23).

Theorem 2.1 Let $\mathcal{M}_1, \mathcal{M}_2 \in \mathcal{G}(S)$. Then:
a) 

Pm^Mi = PMiMi'i (29) 
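b) Suppose $\mathcal{M}_1 \in \mathcal{G}^{\prime,Z}(S)$ and $\mathcal{M}_2 \in \mathcal{G}^{\prime,Z}(S)$. Then $\mathcal{M}_1\mathcal{M}_2 \in \mathcal{G}^{\prime,Z}(S)$, and

$$H_{\mathcal{M}_1\mathcal{M}_2}(x) = H_{\mathcal{M}_1}(x) + \sum_{w_2 \in W_2} \int_K h(\|yM_2(w_2)\|)\,P_{\mathcal{M}_1}\delta_x(dy). \qquad (30)$$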

Proof. Since 
and 

(u,Pm 1 m 2 ^) = (TmiM 2 u,LJ>) 

if u € C[ivf] and fi E P(K) because of in order to prove a) it therefore 

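suffices to prove that if $u \in C[K]$ and $x \in K$ then

$$T_{\mathcal{M}_1\mathcal{M}_2}u(x) = T_{\mathcal{M}_1}T_{\mathcal{M}_2}u(x). \qquad (31)$$

Clearly

$$T_{\mathcal{M}_1\mathcal{M}_2}u(x) = \sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} u(x_2((w_1, w_2))) \cdot \|xM_1(w_1)M_2(w_2)\|,$$

and by using the scaling property (25) and the relation (26) we find that

$$\sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} u(x_2((w_1, w_2))) \cdot \|xM_1(w_1)M_2(w_2)\| =$$

$$\sum_{w_1 \in W_{\mathcal{M}_1}(x)} \sum_{w_2 \in W_{\mathcal{M}_2}(x_1(w_1))} u([x_1(w_1)M_2(w_2)]) \cdot \|x_1(w_1)M_2(w_2)\| \cdot \|xM_1(w_1)\| =$$

$$\sum_{w_1 \in W_{\mathcal{M}_1}(x)} T_{\mathcal{M}_2}u(x_1(w_1)) \cdot \|xM_1(w_1)\| = T_{\mathcal{M}_1}T_{\mathcal{M}_2}u(x).$$

Hence (31) holds and hence a) is proved.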

The proof of b) is similar. We first assume that $\mathcal{M}_1, \mathcal{M}_2, \mathcal{M}_1\mathcal{M}_2 \in \mathcal{G}^{\prime,Z}(S)$.
By definition

$$H_{\mathcal{M}_1\mathcal{M}_2}(x) = \sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} g(\|xM_1(w_1)M_2(w_2)\|) \cdot \|xM_1(w_1)M_2(w_2)\|.$$

Now from the scaling property and the additivity property of the function $g$ we
find, for $w_1 \in W_{\mathcal{M}_1}(x)$, that

$$g(\|xM_1(w_1)M_2(w_2)\|) \cdot \|xM_1(w_1)M_2(w_2)\| =$$

$$g(\|x_1(w_1)M_2(w_2)\| \cdot \|xM_1(w_1)\|) \cdot \|xM_1(w_1)M_2(w_2)\| =$$

$$(g(\|x_1(w_1)M_2(w_2)\|) + g(\|xM_1(w_1)\|)) \cdot \|xM_1(w_1)M_2(w_2)\|.$$

Hence

$$H_{\mathcal{M}_1\mathcal{M}_2}(x) = \sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} g(\|x_1(w_1)M_2(w_2)\|) \cdot \|xM_1(w_1)M_2(w_2)\| +$$

$$\sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} g(\|xM_1(w_1)\|) \cdot \|xM_1(w_1)M_2(w_2)\|.$$

The second term is equal to $H_{\mathcal{M}_1}(x)$ because of (28). Furthermore by using
the scaling property again we find that

$$g(\|x_1(w_1)M_2(w_2)\|) \cdot \|xM_1(w_1)M_2(w_2)\| = g(\|x_1(w_1)M_2(w_2)\|) \cdot \|x_1(w_1)M_2(w_2)\| \cdot \|xM_1(w_1)\|$$

if $w_1 \in W_{\mathcal{M}_1}(x)$, and then, using also (26), we find

$$\sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} g(\|x_1(w_1)M_2(w_2)\|) \cdot \|xM_1(w_1)M_2(w_2)\| =$$

$$\sum_{(w_1,w_2) \in W_{\mathcal{M}_1\mathcal{M}_2}(x)} g(\|x_1(w_1)M_2(w_2)\|) \cdot \|x_1(w_1)M_2(w_2)\| \cdot \|xM_1(w_1)\| =$$

$$\sum_{w_1 \in W_{\mathcal{M}_1}(x)} \|xM_1(w_1)\| \sum_{w_2 \in W_{\mathcal{M}_2}(x_1(w_1))} g(\|x_1(w_1)M_2(w_2)\|) \cdot \|x_1(w_1)M_2(w_2)\| =$$

$$\sum_{w_1 \in W_{\mathcal{M}_1}(x)} \|xM_1(w_1)\| \sum_{w_2 \in W_2} h(\|x_1(w_1)M_2(w_2)\|) =$$

$$\sum_{w_2 \in W_2} \sum_{w_1 \in W_{\mathcal{M}_1}(x)} \|xM_1(w_1)\| \, h(\|x_1(w_1)M_2(w_2)\|).$$

From the definition of $P_{\mathcal{M}_1}$ it then follows that

$$\sum_{w_2 \in W_2} \sum_{w_1 \in W_{\mathcal{M}_1}(x)} \|xM_1(w_1)\| \, h(\|x_1(w_1)M_2(w_2)\|) = \sum_{w_2 \in W_2} \int_K h(\|yM_2(w_2)\|)\,P_{\mathcal{M}_1}(x, dy).$$

Hence

$$H_{\mathcal{M}_1\mathcal{M}_2}(x) = H_{\mathcal{M}_1}(x) + \sum_{w_2 \in W_2} \int_K h(\|yM_2(w_2)\|)\,P_{\mathcal{M}_1}(x, dy).$$

Finally since $P_{\mathcal{M}_1}(x, \cdot) = P_{\mathcal{M}_1}\delta_x(\cdot)$ we can conclude that (30) is true.

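In order to prove that $\mathcal{M}_1\mathcal{M}_2 \in \mathcal{G}^{\prime,Z}(S)$, if both $\mathcal{M}_1 \in \mathcal{G}^{\prime,Z}(S)$ and $\mathcal{M}_2 \in \mathcal{G}^{\prime,Z}(S)$, we only have to follow the reasoning above backwards. □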

Part b) of Theorem 2.1 will be used in section 13 to prove entropy results. 
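
The identity in part b) can be illustrated numerically. The sketch below is not part of the paper: it assumes a two-state example with hypothetical partitions `part1` and `part2`, it does not check membership in the class required by the theorem, and it only verifies the algebraic identity $H_{\mathcal{M}_1\mathcal{M}_2}(x) = H_{\mathcal{M}_1}(x) + \sum_{w_1} \|xM_1(w_1)\| H_{\mathcal{M}_2}(x_1(w_1))$ in floating point, which is the right-hand side of (30) written out for a finite index set.

```python
import math

def l1(v):
    return sum(abs(t) for t in v)

def vecmat(x, A):
    """Row vector times matrix."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def matmul(A, B):
    return [vecmat(row, B) for row in A]

def h(t):
    """h(t) = -t * log2(t), with h(0) = 0."""
    return -t * math.log2(t) if t > 0 else 0.0

def H(x, partition):
    """H_M(x) as in (27) for a finite partition."""
    return sum(h(l1(vecmat(x, M))) for M in partition)

# Hypothetical partitions: in each, the matrices sum to a stochastic matrix.
part1 = [[[0.5, 0.0], [0.0, 0.75]], [[0.0, 0.5], [0.25, 0.0]]]
part2 = [[[0.2, 0.3], [0.1, 0.0]], [[0.0, 0.5], [0.5, 0.4]]]
x = [0.3, 0.7]

product = [matmul(A, B) for A in part1 for B in part2]
lhs = H(x, product)

rhs = H(x, part1)
for A in part1:
    y = vecmat(x, A)
    mass = l1(y)
    if mass > 0:
        # mass is the probability of moving from x to y / mass
        rhs += mass * H([t / mass for t in y], part2)
```

Here `lhs` and `rhs` agree up to rounding error, reflecting the additivity of $g$ on products.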
Corollary 2.1 Let $S$ be a denumerable set and let $\mathcal{M}$ be a partition of $P \in PM(S \times S)$. Then, for each positive integer $n$,

$$P^n_{\mathcal{M}} = P_{\mathcal{M}^n}, \qquad (32)$$

and

$$T^n_{\mathcal{M}} = T_{\mathcal{M}^n}. \qquad (33)$$



Proof. Relation (|32| follows from (|29| by induction, and (J33J) follows from 
and ([H])- □ 

3 A universal inequality. 

The following inequality, which we formulate as a theorem, is in principle proved
in [16], section 4. If $u \in F[K]$, we define $\|u\| = \sup_{x \in K} |u(x)|$.

Theorem 3.1 Let $S$ be a denumerable set and suppose $\mathcal{M} \in \mathcal{G}(S)$. Then,

$$u \in Lip[K] \Rightarrow T_{\mathcal{M}}u \in Lip[K] \qquad (34)$$

and

$$\gamma(T_{\mathcal{M}}u) \le \|u\| + 2\gamma(u). \qquad (35)$$

Proof. We shall follow the proof of Lemma 4.3 in [16] closely. To simplify
notations we shall write $T$ instead of $T_{\mathcal{M}}$.

In order to prove (34) and (35) it suffices to prove that for any $u \in Lip[K]$
and any two elements $x, y \in K$

$$|Tu(x) - Tu(y)| \le (\|u\| + 2\gamma(u)) \cdot \|x - y\|. \qquad (36)$$

Let $S_1 = \{i : (x)_i > 0, (y)_i > 0\}$, $S_2 = \{i : (x)_i > 0, (y)_i = 0\}$, $S_3 = \{i : (x)_i = 0, (y)_i > 0\}$, let $\mathcal{M} = \{M(w) : w \in W\}$ and

$$W_i = \{w \in W : \|e^iM(w)\| > 0\}.$$

Using the fact that for an arbitrary $z \in K$ and arbitrary $w \in W$, we have

$$\|zM(w)\| = \sum_{i \in S} (z)_i \cdot \|e^iM(w)\|,$$

we obtain, by using the triangle inequality and (14), that

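$$|Tu(x) - Tu(y)| =$$

$$\Big|\sum_{i \in S_1 \cup S_2} (x)_i \sum_{w \in W_i} u(x_1(w)) \cdot \|e^iM(w)\| - \sum_{i \in S_1 \cup S_3} (y)_i \sum_{w \in W_i} u(y_1(w)) \cdot \|e^iM(w)\|\Big| \le$$

$$\sum_{i \in S} |(x)_i - (y)_i| \cdot \|u\| \sum_{w \in W_i} \|e^iM(w)\| +$$

$$\Big|\sum_{i \in S_1} (y)_i \sum_{w \in W_i} u(x_1(w)) \cdot \|e^iM(w)\| - \sum_{i \in S_1} (y)_i \sum_{w \in W_i} u(y_1(w)) \cdot \|e^iM(w)\|\Big| \le$$

$$\|x - y\| \cdot \|u\| + \gamma(u) \sum_{i \in S_1} (y)_i \sum_{w \in W_i} \|x_1(w) - y_1(w)\| \cdot \|e^iM(w)\|. \qquad (37)$$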

In order to prove that the last term is less than $2\gamma(u)\|x - y\|$ we shall use
the following inequality. Let $a$ and $b$ be two nonzero vectors in a normed vector
space. Then

$$\|a/\|a\| - b/\|b\|\| = \|a/\|a\| - a/\|b\| + a/\|b\| - b/\|b\|\| \le$$

$$|\|a\| - \|b\||/\|b\| + \|a - b\|/\|b\| \le 2\|a - b\|/\|b\|$$

and hence

$$\|a/\|a\| - b/\|b\|\| \le 2\|a - b\|/\|b\|. \qquad (38)$$
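
As a sanity check, inequality (38) can be tested on random vectors in the $l_1$ norm. This is only an illustrative sketch; the function name `gap`, the dimension 4 and the sample size are arbitrary choices made for the example.

```python
import random

def l1(v):
    return sum(abs(t) for t in v)

def gap(a, b):
    """Right-hand side minus left-hand side of (38) in the l1 norm."""
    na, nb = l1(a), l1(b)
    lhs = l1([s / na - t / nb for s, t in zip(a, b)])
    rhs = 2 * l1([s - t for s, t in zip(a, b)]) / nb
    return rhs - lhs

rng = random.Random(0)  # fixed seed so the run is reproducible
worst = min(
    gap([rng.uniform(-1, 1) for _ in range(4)],
        [rng.uniform(-1, 1) for _ in range(4)])
    for _ in range(1000)
)
```

The smallest observed gap `worst` stays nonnegative (up to rounding), in agreement with (38).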

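Using (38) we find that

$$\|x_1(w) - y_1(w)\| \le 2\|xM(w) - yM(w)\|/\|yM(w)\| \qquad (39)$$

if $\|xM(w)\| \cdot \|yM(w)\| > 0$, and by using (39), the triangle inequality and
change of summation order, we obtain

$$\gamma(u) \sum_{i \in S_1} (y)_i \sum_{w \in W_i} \|x_1(w) - y_1(w)\| \cdot \|e^iM(w)\| \le$$

$$\gamma(u) \sum_{i \in S_1} (y)_i \sum_{w \in W_i} (2\|xM(w) - yM(w)\|/\|yM(w)\|) \cdot \|e^iM(w)\| \le$$

$$2\gamma(u) \sum_{w \in W} \|xM(w) - yM(w)\| \sum_{i \in S} (y)_i \|e^iM(w)\|/\|yM(w)\| =$$

$$2\gamma(u) \sum_{w \in W} \|xM(w) - yM(w)\| \le 2\gamma(u) \sum_{w \in W} \sum_{i \in S} \sum_{j \in S} |(x)_i - (y)_i| (M(w))_{i,j} =$$

$$2\gamma(u) \sum_{i \in S} |(x)_i - (y)_i| \sum_{j \in S} \sum_{w \in W} (M(w))_{i,j} =$$

$$2\gamma(u) \sum_{i \in S} |(x)_i - (y)_i| \sum_{j \in S} (P)_{i,j} =$$

$$2\gamma(u) \sum_{i \in S} |(x)_i - (y)_i| = 2\gamma(u)\|x - y\|$$

which combined with (37) implies (36). □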

Corollary 3.1 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$ and $u \in Lip[K]$.
Then

$$\gamma(T_{\mathcal{M}}u) \le 3\gamma(u). \qquad (40)$$

Proof. Let $u \in Lip[K]$, set $v = u/\gamma(u)$, set $osc(v) = \sup\{v(x) - v(y) : x, y \in K\}$
and define $v_0 = v - osc(v)/2 - \inf\{v(x) : x \in K\}$. Since $\sup\{\|x - y\| : x, y \in K\} = 2$ and $\gamma(v_0) = \gamma(v) \le 1$ it is clear that $\|v_0\| \le 1$. Hence by Theorem 3.1
follows that $\gamma(T_{\mathcal{M}}v_0) \le 3$ and hence $\gamma(T_{\mathcal{M}}u) = \gamma(u)\gamma(T_{\mathcal{M}}v) = \gamma(u)\gamma(T_{\mathcal{M}}v_0) \le 3\gamma(u)$. □
Corollary 3.2 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$ and let $u \in Lip[K]$.
Then

$$\gamma(T^n_{\mathcal{M}}u) \le 3\gamma(u), \quad n = 1, 2, \ldots. \qquad (41)$$

Proof. Let $u \in Lip[K]$. From Corollary 2.1 follows that $\gamma(T^n_{\mathcal{M}}u) = \gamma(T_{\mathcal{M}^n}u)$
and then (41) follows from Corollary 3.1. □



Next let $Q(K)$ denote the set of nonnegative, finite, Borel measures on $(K, \mathcal{E})$
with positive total mass. For $\mu \in Q(K)$ we write $\|\mu\| = \mu(K)$. For $r > 0$ we
define $Q_r(K) = \{\mu \in Q(K) : \|\mu\| = r\}$. If both $\mu, \nu \in Q_r(K)$ we define - for
any $r > 0$ -

$$d_K(\mu, \nu) = \sup\Big\{\int_K u(y)\mu(dy) - \int_K u(y)\nu(dy) : u \in Lip_1[K]\Big\}. \qquad (42)$$

Note that if $\mu, \nu \in Q_r(K)$ then $\mu/r, \nu/r \in \mathcal{P}(K)$ and

$$d_K(\mu, \nu) = \|\mu\| \, d_K(\mu/\|\mu\|, \nu/\|\mu\|). \qquad (43)$$

Note also that

$$d_K(\mu, \nu) = \sup\Big\{\int_K u(y)\mu(dy) - \int_K u(y)\nu(dy) : u \in Lip_1[K], \|u\| \le 1\Big\}$$

since $\sup\{\|x - y\| : x, y \in K\} = 2$.

That $d_K(\cdot, \cdot)$ determines a metric on $\mathcal{P}(K)$ is well-known (see e.g. [10],
Chapter 11, section 3), and from (43) follows that $d_K(\cdot, \cdot)$ determines a metric also
on $Q_r(K)$ for any $r > 0$. We shall call the metric $d_K(\cdot, \cdot)$ the Kantorovich
metric.

From the definition of $d_K(\cdot, \cdot)$ it readily follows that

$$d_K(\delta_x, \delta_y) = \delta(x, y) = \|x - y\|.$$

If $\mathcal{P}' \subset \mathcal{P}(K)$ and $\mu \in \mathcal{P}(K)$ we define

$$d_K(\mu, \mathcal{P}') = \inf\{d_K(\mu, \nu) : \nu \in \mathcal{P}'\}.$$

We shall later have use for the following inequality which follows easily from
the definition of $d_K(\cdot, \cdot)$ and the triangle inequality.

Proposition 3.1 Let $\mu_1, \mu_2, \nu_1, \nu_2 \in Q(K)$ satisfy $\|\mu_1\| = \|\nu_1\|$ and $\|\mu_2\| = \|\nu_2\|$. Let $\alpha > 0$ and $\beta > 0$ be two real numbers. Then

$$d_K(\alpha\mu_1 + \beta\mu_2, \alpha\nu_1 + \beta\nu_2) \le \alpha d_K(\mu_1, \nu_1) + \beta d_K(\mu_2, \nu_2).$$

It is well-known that the Kantorovich metric $d_K$ on $Q_r(K)$ can be defined
in another way. Let $K^2 = K \times K$, let $\mathcal{E}^2 = \mathcal{E} \otimes \mathcal{E}$, let $r > 0$ and let $Q_r(K \times K)$
denote the set of nonnegative measures on $(K^2, \mathcal{E}^2)$ with total mass equal to
$r$. For any two measures $\mu, \nu \in Q_r(K)$ we let $\tilde{Q}_r(\mu, \nu)$ denote the subset of
$Q_r(K \times K)$ consisting of those measures $\tilde{\mu}(dx, dy)$ such that

$$\tilde{\mu}(A, K) = \mu(A), \quad \forall A \in \mathcal{E},$$

and

$$\tilde{\mu}(K, B) = \nu(B), \quad \forall B \in \mathcal{E}.$$

Then

$$d_K(\mu, \nu) = \inf\Big\{\int_{K \times K} \delta(x, y)\tilde{\mu}(dx, dy) : \tilde{\mu}(dx, dy) \in \tilde{Q}_r(\mu, \nu)\Big\}. \qquad (44)$$

A proof of the equality between (42) and (44) when $r = 1$ can be found in
[10], section 11.8, and the equality between (42) and (44) for $r \ne 1$ follows then
by using the relation (43). The proof of the fact that the two definitions of $d_K$
give the same value goes back to Kantorovich (see [17]). For a short overview
of the Kantorovich metric and some applications, see [27].

Having introduced the metric $d_K$ the following corollary to Corollary 3.2
follows immediately. We omit the proof.

Corollary 3.3 Let $S$ be a denumerable set, let $P \in PM(S \times S)$ and let $\mathcal{M}$ be
a partition of $P$. Let $\mu$ and $\nu$ be two arbitrary measures in $\mathcal{P}(K)$. Then

$$d_K(P^n_{\mathcal{M}}\mu, P^n_{\mathcal{M}}\nu) \le 3 d_K(\mu, \nu) \quad \text{for } n = 1, 2, \ldots. \qquad (45)$$

It is not difficult to construct an example that shows that the constant 3,
occurring on the right hand side of formula (45), can not be replaced by a
constant strictly less than 2. Take e.g. $S = \{1, 2, 3\}$, let $P = I$, the identity
matrix, and let the partition consist of two matrices $M(1)$, $M(2)$ such that the
first two columns of $P$ and $M(1)$ are equal and the last column of $P$ is equal
to the last column of $M(2)$. Let $x = (1 - \epsilon, \epsilon, 0)$ and $y = (1 - \epsilon, 0, \epsilon)$. Then, for
$0 < \epsilon < 1$, $d_K(\delta_x, \delta_y) = 2\epsilon$ and $d_K(P_{\mathcal{M}}\delta_y, P_{\mathcal{M}}\delta_x) = d_K(\delta_x, \delta_y)(2 - \epsilon)$.
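
The numbers in this example can be reproduced with a few lines of exact arithmetic. The sketch below is illustrative only: the helper names `push` and `dist_dirac_to_discrete` are hypothetical, the value $\epsilon = 1/10$ is an arbitrary choice, and the distance computation relies on the fact that when one of the two measures is a Dirac measure the only coupling in (44) is the product measure.

```python
from fractions import Fraction as F

def l1(v):
    return sum(abs(t) for t in v)

def vecmat(x, A):
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def push(point, partition):
    """One step from the Dirac measure at `point`: list of (mass, new point)."""
    atoms = []
    for M in partition:
        y = vecmat(point, M)
        n = l1(y)
        if n > 0:
            atoms.append((n, tuple(t / n for t in y)))
    return atoms

def dist_dirac_to_discrete(x, atoms):
    """Kantorovich distance between delta_x and sum_k mass_k * delta_{z_k}."""
    return sum(mass * l1([s - t for s, t in zip(x, z)]) for mass, z in atoms)

eps = F(1, 10)
M1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # first two columns of the identity
M2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]  # last column of the identity
x = (1 - eps, eps, F(0))
y = (1 - eps, F(0), eps)

before = l1([s - t for s, t in zip(x, y)])  # d_K(delta_x, delta_y)
image_x = push(x, [M1, M2])                 # stays the Dirac measure at x
image_y = push(y, [M1, M2])                 # splits into two unit vectors
after = dist_dirac_to_discrete(x, image_y)
```

With these choices `before` equals $2\epsilon$ and `after` equals $2\epsilon(2 - \epsilon)$, so the ratio approaches 2 as $\epsilon \to 0$.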

Conjecture 3.1 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$ and $\mu, \nu \in \mathcal{P}(K)$.
Then

$$d_K(P_{\mathcal{M}}\mu, P_{\mathcal{M}}\nu) \le 2 d_K(\mu, \nu).$$

We end this section by stating another well-known result concerning the
metric $d_K(\cdot, \cdot)$. See e.g. [10], Theorem 11.3.3.

Proposition 3.2 Let $\mu \in \mathcal{P}(K)$ and $\{\mu_n, n = 1, 2, \ldots\}$ be a sequence of probability measures in $\mathcal{P}(K)$ such that

$$\lim_{n \to \infty} d_K(\mu_n, \mu) = 0.$$

Then for each $u \in C[K]$

$$\lim_{n \to \infty} \int_K u(y)\mu_n(dy) = \int_K u(y)\mu(dy).$$



4 On probability measures with equal barycenter.

Throughout this section we shall use the set of positive integers when specifying
the elements of a denumerable set. We let $J = \{1, 2, \ldots\}$ either denote a finite
sequence of consecutive positive integers starting with the integer 1, or the whole
set of positive integers, and we denote an arbitrary denumerable set by

$$S = \{i_j : j \in J\}.$$
In the introduction we defined the barycenter $b(\mu)$ of a probability measure
$\mu \in \mathcal{P}(K)$. We use the same definition for a measure $\mu \in Q(K)$. Thus if $\mu \in Q(K)$ we call the vector $b(\mu)$ defined by

$$(b(\mu))_i = \int_K (y)_i \mu(dy), \quad i \in S,$$

the barycenter of $\mu$. Also recall that we defined $\mathcal{P}(K|q) = \{\mu \in \mathcal{P}(K) : b(\mu) = q\}$.

If the underlying space S is finite, then the space (K, £) is a compact, metric 
space, and from the general theory regarding probability measures on compact, 
metric spaces, it is well-known that V{K) is then also a compact, metric space 
in the topology determined by the Kantorovich metric. From Corollary 4.2 
below it follows easily that V(K\q) is a closed set, and therefore V(K\q) is also 
compact when S is finite, since it is a closed subset of the compact set V{K). 

Next let us prove the following tightness result which is based on a moment
condition. Recall that a subset $\mathcal{P}'$ of $\mathcal{P}(K)$ is tight if for every $\epsilon > 0$ one can
find a compact set $C \subset K$ such that $\mu(C) > 1 - \epsilon$, $\forall \mu \in \mathcal{P}'$.

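Proposition 4.1 Let $S = \{i_j, j = 1, 2, \ldots\}$ be infinite, let $q \in K$ and suppose
that there exists a $\kappa > 0$ such that

$$\sum_{j=1}^{\infty} j^{1+\kappa}(q)_{i_j} < \infty.$$

Then the set $\mathcal{P}(K|q)$ is a tight set in $\mathcal{P}(K)$.
Proof. Let $\epsilon > 0$ be given and set

$$A = \sum_{j=1}^{\infty} j^{1+\kappa}(q)_{i_j}.$$

Let $E_{j,\epsilon}$, $j = 1, 2, \ldots$ be defined by

$$E_{j,\epsilon} = \{x \in K : (x)_{i_j} < 2A/(j^{1+\kappa}\epsilon)\}.$$

Using the fact that

$$\sum_{j=1}^{\infty} 1/(j^{1+\kappa}) < \infty$$

it is easy to prove that the set

$$E_\epsilon = \bigcap_{j=1}^{\infty} E_{j,\epsilon}$$

is a compact set for every fixed $\epsilon > 0$. But from Markov's inequality it follows
that if $\mu \in \mathcal{P}(K|q)$ then

$$\mu(K \setminus E_{j,\epsilon}) \le (q)_{i_j}(j^{1+\kappa}\epsilon/2A).$$

Hence

$$\mu\Big(K \setminus \Big(\bigcap_{j=1}^{\infty} E_{j,\epsilon}\Big)\Big) \le \sum_{j=1}^{\infty} \mu(K \setminus E_{j,\epsilon}) \le \sum_{j=1}^{\infty} (q)_{i_j}(j^{1+\kappa}\epsilon/2A) = \epsilon/2 < \epsilon$$

and thereby we have proved tightness. □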

The moment condition of Proposition 4.1 is, however, unnecessary. The following is true.

Theorem 4.1 Let $S$ be a denumerable and infinite set and let $q \in K$. Then
$\mathcal{P}(K|q)$ is tight.

In order to prove Theorem 4.1 our main goal will be to prove the following
theorem.

Theorem 4.2 For each $q \in K$ the set $\mathcal{P}(K|q)$ is a compact set in the topology
induced by the metric $d_K(\cdot, \cdot)$.

In order to prepare ourselves for a rather short proof of this theorem, we 
shall first prove two lemmas. 

Lemma 4.1 Let $S = \{i_j, j = 1, 2, \ldots\}$ be denumerable, let $r > 0$ and let $\mu, \nu \in Q_r(K)$. Then

$$\|b(\nu) - b(\mu)\| \le d_K(\nu, \mu).$$

Proof. Set

$$a = b(\nu), \text{ and } b = b(\mu).$$

For $i \in S$ define

$$u_i(x) = (x)_i.$$

Clearly $u_i \in Lip_1[K]$, $i \in S$, and by definition

$$\int_K u_i(x)\nu(dx) = (a)_i \text{ and } \int_K u_i(x)\mu(dx) = (b)_i \text{ for } i \in S.$$

Next define $S^+$ by

$$S^+ = \{i : (a)_i > 0 \text{ or } (b)_i > 0\},$$

let $n$ be an arbitrary positive integer, let $k$ be an integer in the interval $1 \le k \le 2^n$, and let $\epsilon^{n,k} = (\epsilon_{1,k}, \epsilon_{2,k}, \ldots, \epsilon_{n,k})$ specify one vector such that $\epsilon_{j,k} = +1$ or
$\epsilon_{j,k} = -1$, $j = 1, 2, \ldots, n$. There are exactly $2^n$ such vectors.

Next, let $v_{\epsilon^{n,k}}$ specify one of the $2^n$ possible functions one can obtain by the
definition

$$v_{\epsilon^{n,k}} = \sum_{j=1}^{n} \epsilon_{j,k} u_{i_j}.$$

Since $u_i(x) = (x)_i$, $i \in S$, it follows that if $x, y \in K$ then

$$|v_{\epsilon^{n,k}}(x) - v_{\epsilon^{n,k}}(y)| \le \sum_{j=1}^{n} |(x)_{i_j} - (y)_{i_j}| \le \|x - y\|$$

and hence $v_{\epsilon^{n,k}} \in Lip_1[K]$ for $1 \le k \le 2^n$. Thus

$$d_K(\nu, \mu) \ge \max\Big\{\int_K v_{\epsilon^{n,k}}(x)\nu(dx) - \int_K v_{\epsilon^{n,k}}(x)\mu(dx) : k = 1, 2, \ldots, 2^n\Big\} = \sum_{j=1}^{n} |(a)_{i_j} - (b)_{i_j}|.$$

Now, if $S^+$ is finite, by choosing $n$ large enough, it follows that

$$d_K(\nu, \mu) \ge \sum_{j=1}^{n} |(a)_{i_j} - (b)_{i_j}| = \|a - b\|,$$

which was what we wanted to prove. If $S^+$ is infinite, we can conclude that
$d_K(\nu, \mu) \ge \sum_{j=1}^{n} |(a)_{i_j} - (b)_{i_j}|$ and by letting $n \to \infty$, we also can conclude that

$$d_K(\nu, \mu) \ge \|a - b\| = \|b(\nu) - b(\mu)\|$$

and thereby Lemma 4.1 is proved. □ 
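
The key step of this proof is that maximizing over all $2^n$ sign patterns recovers the sum of absolute differences of the barycenter coordinates. A minimal sketch of that step, with hypothetical vectors `a` and `b` chosen only for illustration:

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical barycenters on a four-point state space.
a = [F(1, 2), F(1, 3), F(1, 6), F(0)]
b = [F(1, 4), F(1, 2), F(0), F(1, 4)]

# Maximum over all sign vectors of sum_j eps_j * ((a)_j - (b)_j).
best = max(
    sum(s * (p - q) for s, p, q in zip(signs, a, b))
    for signs in product((1, -1), repeat=len(a))
)
l1_dist = sum(abs(p - q) for p, q in zip(a, b))
```

The maximum is attained by taking each sign equal to the sign of the corresponding coordinate difference, so `best` coincides with `l1_dist`.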

Lemma 4.2 Let $S$ be a denumerable set, and let $K$ denote the probability vectors on $S$. Let $N$ be a positive integer and let $\xi_k$, $k = 1, 2, \ldots, N$ be vectors in
$K$. Let $\beta_k > 0$, $k = 1, 2, \ldots, N$, define the measure $\varphi \in Q(K)$ by

$$\varphi = \sum_{k=1}^{N} \beta_k\delta_{\xi_k}$$

where as usual $\delta_\xi$ denotes the probability measure with unit mass at $\xi$, and define
the vector $a$ by

$$a = \sum_{k=1}^{N} \beta_k\xi_k.$$

Let $b = ((b)_i, i \in S)$ be a vector satisfying $(b)_i \ge 0$, $i \in S$, and

$$\|b\| = \|a\|.$$

Then there exist vectors $\zeta_k$, $k = 1, 2, \ldots, N$, in $K$ such that

$$b = \sum_{k=1}^{N} \beta_k\zeta_k,$$

and such that if we define

$$\Psi = \sum_{k=1}^{N} \beta_k\delta_{\zeta_k}$$

then

$$d_K(\varphi, \Psi) = \|a - b\|.$$

Proof. That the conclusion of the lemma is true when $N = 1$ is easily proved.
Simply define

$$\zeta_1 = b/\beta_1.$$

Since $\|b\| = \|a\| = \beta_1$ it is clear that $\|\zeta_1\| = 1$ and hence $\zeta_1 \in K$ since
$(b)_i \ge 0$, $i \in S$. Now defining $\Psi = \beta_1\delta_{\zeta_1}$ we find

$$d_K(\varphi, \Psi) = d_K(\beta_1\delta_{\xi_1}, \beta_1\delta_{\zeta_1}) = \beta_1 d_K(\delta_{\xi_1}, \delta_{\zeta_1}) = \beta_1\|\xi_1 - \zeta_1\| = \|\beta_1\xi_1 - \beta_1\zeta_1\| = \|a - b\|$$

where we have used the fact that $d_K(\delta_x, \delta_y) = \|x - y\|$ for any pair of points $x$
and $y$ in $K$.
The case when $a = b$ is trivial. Just take $\zeta_k = \xi_k$, $k = 1, 2, \ldots, N$ and define
$\Psi = \varphi$. In the remaining part of the proof we therefore assume that $a \ne b$.

Let us now assume that we have proved the conclusion of the lemma for
$N = M - 1$ and let us prove the conclusion for $N = M$, where $M \ge 2$.

We first define the sets $S_1$, $S_2$ and $S_3$ by

$$S_1 = \{i \in S : (a)_i > (b)_i\},$$

$$S_2 = \{i \in S : (a)_i < (b)_i\},$$

and

$$S_3 = \{i \in S : (a)_i = (b)_i\}.$$

Since $a \ne b$ and $\|a\| = \|b\|$ both the sets $S_1$ and $S_2$ are nonempty. Let us set

$$\Delta = \sum_{i \in S_1} ((a)_i - (b)_i).$$

Then clearly

$$\Delta = \sum_{j \in S_2} ((b)_j - (a)_j) = \|a - b\|/2.$$

Next let us consider the vector $\xi_M$. We define

$$R_1 = \{i \in S_1 : (\xi_M)_i > 0\}.$$

Assume first that $R_1 = \emptyset$. Then define the vector $a_1$, the vector $\zeta_M$ and the
vector $b_1$ by

$$a_1 = a - \beta_M\xi_M,$$

$$\zeta_M = \xi_M,$$

and

$$b_1 = b - \beta_M\zeta_M.$$

For $i \in S_1$ we now find that $(a_1)_i = (a)_i$ and $(b_1)_i = (b)_i$ since $(\xi_M)_i = 0$ if
$i \in S_1$. Therefore, we note in particular that $(b_1)_i \ge 0$, $i \in S_1$. Moreover, if
$i \in S_2 \cup S_3$ then by using the fact that $(a)_i \le (b)_i$, $\forall i \in S_2 \cup S_3$, using the fact
that $\zeta_M = \xi_M$ and using the fact that $(a_1)_i \ge 0$, $i \in S$, we can conclude that
$(b_1)_i \ge 0$, $i \in S$. Thus both $b$, $b_1$ and $\zeta_M$ have nonnegative coordinates, which
implies that

$$\|b_1\| = \|b\| - \beta_M\|\zeta_M\|$$

since the norm $\|\cdot\|$ is defined by the $l_1$-metric; since furthermore $\|a\| = \|b\|$
and $\|\zeta_M\| = \|\xi_M\|$, we find that

$$\|b_1\| = \|a\| - \beta_M\|\xi_M\|.$$

But since also $0 \le (a_1)_i \le (a)_i$, $i \in S$, it is clear that we also have

$$\|a_1\| = \|a\| - \beta_M\|\xi_M\|.$$

Hence

$$\|b_1\| = \|a_1\|.$$
Since $\beta_k > 0$, $k = 1, 2, \ldots, M$, it is clear that $\|a_1\| > 0$, and hence we also have
that $\|b_1\| > 0$. Furthermore since $\zeta_M = \xi_M$ it is clear that

$$\|a_1 - b_1\| = \|a - \beta_M\xi_M - b + \beta_M\zeta_M\| = \|a - b\|. \qquad (46)$$

From the induction hypothesis now follows that there exists a set of vectors
$\{\zeta_k \in K, k = 1, 2, \ldots, M-1\}$ such that

$$b_1 = \sum_{k=1}^{M-1} \beta_k\zeta_k$$

and such that if we denote

$$\varphi_1 = \sum_{k=1}^{M-1} \beta_k\delta_{\xi_k}$$

and define

$$\Psi_1 = \sum_{k=1}^{M-1} \beta_k\delta_{\zeta_k}$$

then

$$d_K(\varphi_1, \Psi_1) = \|a_1 - b_1\|. \qquad (47)$$

Therefore, if we define

$$\Psi = \sum_{k=1}^{M} \beta_k\delta_{\zeta_k}$$

we find that

$$d_K(\varphi, \Psi) = d_K(\varphi_1 + \beta_M\delta_{\xi_M}, \Psi_1 + \beta_M\delta_{\zeta_M})$$

and from Proposition 3.1 it follows that

$$d_K(\varphi, \Psi) \le d_K(\varphi_1, \Psi_1) + \beta_M d_K(\delta_{\xi_M}, \delta_{\zeta_M}) = d_K(\varphi_1, \Psi_1) \qquad (48)$$

since by definition $\zeta_M = \xi_M$. But from (47) and (46) follows that $d_K(\varphi_1, \Psi_1) = \|a_1 - b_1\| = \|a - b\|$ and hence

$$d_K(\varphi, \Psi) \le \|a - b\|$$

because of (48). But $b(\varphi) = a$ and $b(\Psi) = b$ and therefore by Lemma 4.1 we
also know that $d_K(\varphi, \Psi) \ge \|a - b\|$. Hence

$$d_K(\varphi, \Psi) = \|a - b\|,$$

which was what we wanted to prove.

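Next consider the case when the set $R_1$ is not empty. We shall again use $\xi_M$
to define a new vector $\zeta_M$. Roughly speaking what we shall do is to move as large
part as possible from a coordinate $(\xi_M)_i$, $i \in R_1$, to a coordinate $(\xi_M)_j$, $j \in S_2$.
Hence, we claim that there exist nonnegative numbers $t_{i,j}$, $i \in R_1$, $j \in S_2$
with the following properties:

$$\sum_{j \in S_2} t_{i,j} = \min\{\beta_M(\xi_M)_i, ((a)_i - (b)_i)\}, \quad \forall i \in R_1 \qquad (49)$$

and

$$\sum_{i \in R_1} t_{i,j} \le ((b)_j - (a)_j), \quad \forall j \in S_2. \qquad (50)$$

That such a set $\{t_{i,j} : i \in R_1, j \in S_2\}$ exists follows from the following two
observations:

1)

$$\sum_{j \in S_2} ((b)_j - (a)_j) = \Delta,$$

2)

$$\sum_{i \in R_1} \min\{\beta_M(\xi_M)_i, ((a)_i - (b)_i)\} \le \sum_{i \in S_1} ((a)_i - (b)_i) = \Delta.$$

We now simply define the vector $\zeta_M$ by

$$(\zeta_M)_i = (\xi_M)_i - (\beta_M)^{-1} \sum_{j \in S_2} t_{i,j}, \quad i \in R_1, \qquad (51)$$

$$(\zeta_M)_j = (\xi_M)_j + (\beta_M)^{-1} \sum_{i \in R_1} t_{i,j}, \quad j \in S_2, \qquad (52)$$

$$(\zeta_M)_i = (\xi_M)_i, \quad \text{if } i \notin R_1 \cup S_2.$$

Since by definition

$$\sum_{j \in S_2} t_{i,j} \le \beta_M(\xi_M)_i, \quad \forall i \in R_1,$$

it is clear that $(\zeta_M)_i \ge 0$, $\forall i \in R_1$. That also $(\zeta_M)_i \ge 0$ if $i \notin R_1$ follows from
the fact that $(\xi_M)_i \ge 0$ for all $i \in S$. Hence

$$(\zeta_M)_i \ge 0, \quad \forall i \in S,$$

from which we conclude that

$$\|\zeta_M\| = \sum_{i \in R_1} (\zeta_M)_i + \sum_{j \in S_2} (\zeta_M)_j + \sum_{i \notin R_1 \cup S_2} (\zeta_M)_i =$$

$$\|\xi_M\| - \sum_{i \in R_1} (\beta_M)^{-1} \sum_{j \in S_2} t_{i,j} + \sum_{j \in S_2} (\beta_M)^{-1} \sum_{i \in R_1} t_{i,j} = \|\xi_M\|$$

and hence

$$\|\zeta_M\| = \|\xi_M\| = 1.$$

Now define

$$a_1 = a - \beta_M\xi_M = \sum_{k=1}^{M-1} \beta_k\xi_k$$

and

$$b_1 = b - \beta_M\zeta_M.$$

Obviously $(a_1)_i \ge 0$, $\forall i \in S$. Since also $(\xi_M)_i \ge 0$, $\forall i \in S$, it follows, since
we are using the $l_1$-norm, that

$$\|a_1\| = \|a\| - \beta_M\|\xi_M\|.$$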
Next let us investigate whether $(b_1)_i \ge 0$ for all $i \in S$. First suppose
$i \in S_1 \setminus R_1$. Then $(\zeta_M)_i = (\xi_M)_i = 0$ and hence

$$(b_1)_i = (b)_i \ge 0.$$

Next suppose that $i \in R_1$. Then

$$(b_1)_i = (b)_i - \beta_M(\xi_M)_i + \sum_{j \in S_2} t_{i,j}.$$

Now if $\beta_M(\xi_M)_i \le (a)_i - (b)_i$ then $\sum_{j \in S_2} t_{i,j} = \beta_M(\xi_M)_i$ and hence

$$(b_1)_i = (b)_i \ge 0.$$

If instead $\beta_M(\xi_M)_i > (a)_i - (b)_i$ then $\sum_{j \in S_2} t_{i,j} = (a)_i - (b)_i$ and hence

$$(b_1)_i = (b)_i - \beta_M(\xi_M)_i + (a)_i - (b)_i = (a)_i - \beta_M(\xi_M)_i = (a_1)_i \ge 0.$$

If $i \in S_3$ then $(a)_i = (b)_i$ and $(\zeta_M)_i = (\xi_M)_i$ and therefore

$$(b_1)_i = (a_1)_i \ge 0, \quad \forall i \in S_3.$$

Finally if $j \in S_2$ then

$$(b_1)_j = (b)_j - \beta_M(\zeta_M)_j = (b)_j - \beta_M(\xi_M)_j - \sum_{i \in R_1} t_{i,j} \ge (b)_j - \beta_M(\xi_M)_j - ((b)_j - (a)_j)$$

$$= (a)_j - \beta_M(\xi_M)_j = (a_1)_j \ge 0.$$

Hence

$$(b_1)_i \ge 0, \quad \forall i \in S.$$

Therefore, using the fact that $\|b\| = \|a\|$ and the fact that $\|\zeta_M\| = \|\xi_M\|$, it
follows that

$$\|b_1\| = \|b\| - \beta_M\|\zeta_M\| = \|a\| - \beta_M\|\xi_M\| = \|a_1\|.$$

Let us also compute $\|a_1 - b_1\|$. If $i \in S_1 \setminus R_1$ then $(\xi_M)_i = 0$ and therefore
it follows that

$$(a_1)_i - (b_1)_i = (a)_i - (b)_i > 0, \quad \forall i \in S_1 \setminus R_1.$$

If $i \in R_1$ then

$$(a_1)_i - (b_1)_i = (a)_i - \beta_M(\xi_M)_i - (b)_i + \beta_M(\xi_M)_i - \sum_{j \in S_2} t_{i,j} = (a)_i - (b)_i - \sum_{j \in S_2} t_{i,j} \ge 0$$

because of (49). If $i \in S_3$ then $(a)_i = (b)_i$ and $(\zeta_M)_i = (\xi_M)_i$ and therefore

$$(a_1)_i - (b_1)_i = 0, \quad \forall i \in S_3,$$

and if $j \in S_2$ then

$$(b_1)_j - (a_1)_j = (b)_j - \beta_M(\zeta_M)_j - (a)_j + \beta_M(\xi_M)_j = (b)_j - (a)_j - \sum_{i \in R_1} t_{i,j} \ge 0$$

because of (50). Hence

$$\|a_1 - b_1\| = 2\sum_{j \in S_2} ((b)_j - (a)_j) - 2\sum_{i \in R_1, j \in S_2} t_{i,j}$$

and since

$$\|a - b\| = 2\sum_{j \in S_2} ((b)_j - (a)_j)$$

and

$$2\sum_{i \in R_1, j \in S_2} t_{i,j} = \beta_M\|\xi_M - \zeta_M\|$$

because of (51) and (52) we conclude that

$$\|a_1 - b_1\| = \|a - b\| - \beta_M\|\xi_M - \zeta_M\|. \qquad (53)$$

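Now let us set

$$\varphi_1 = \sum_{k=1}^{M-1} \beta_k\delta_{\xi_k}.$$

Since $b_1$ is such that $(b_1)_i \ge 0$, $\forall i \in S$ and $\|b_1\| = \|a_1\|$, it follows from
the induction hypothesis that there exists a set of vectors $\{\zeta_k \in K, k = 1, 2, \ldots, M-1\}$ such that

$$b_1 = \sum_{k=1}^{M-1} \beta_k\zeta_k$$

and if we define

$$\Psi_1 = \sum_{k=1}^{M-1} \beta_k\delta_{\zeta_k}$$

then

$$d_K(\varphi_1, \Psi_1) = \|a_1 - b_1\|.$$

Since $b = b_1 + \beta_M\zeta_M$ we find that

$$b = \sum_{k=1}^{M} \beta_k\zeta_k.$$

Defining

$$\Psi = \sum_{k=1}^{M} \beta_k\delta_{\zeta_k}$$

it follows by Proposition 3.1 and (53) that

$$d_K(\varphi, \Psi) \le d_K(\varphi_1, \Psi_1) + \beta_M d_K(\delta_{\xi_M}, \delta_{\zeta_M}) = \|a_1 - b_1\| + \beta_M\|\xi_M - \zeta_M\| = \|a - b\|.$$

Again, because of Lemma 4.1, we also know that

$$d_K(\varphi, \Psi) \ge \|a - b\|$$

and hence

$$d_K(\varphi, \Psi) = \|a - b\|$$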

and thereby the induction step is completed and the lemma is proved. □ 
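
The induction in this proof is constructive, and it can be turned into a small procedure. The sketch below is one possible reading of it, not the author's code: the function name `redistribute` is hypothetical, the numbers $t_{i,j}$ are chosen greedily (any choice satisfying (49) and (50) would do), $S$ is assumed finite, and exact rational arithmetic is used.

```python
from fractions import Fraction as F

def l1(v):
    return sum(abs(t) for t in v)

def redistribute(xis, betas, b):
    """Return vectors zeta_k with sum_k beta_k * zeta_k == b, built by peeling
    off xi_M for M = N, ..., 2 and moving mass from R_1 to S_2 as in (49)-(52)."""
    n = len(b)
    a = [sum(beta * xi[i] for beta, xi in zip(betas, xis)) for i in range(n)]
    assert l1(a) == l1(b) and all(t >= 0 for t in b)
    b = list(b)
    zetas = [None] * len(xis)
    for m in range(len(xis) - 1, 0, -1):
        beta, xi = betas[m], xis[m]
        zeta = list(xi)
        cap = [max(b[j] - a[j], 0) for j in range(n)]  # room left on S_2, cf. (50)
        for i in range(n):
            if a[i] > b[i] and xi[i] > 0:              # i belongs to R_1
                move = min(beta * xi[i], a[i] - b[i])  # right-hand side of (49)
                zeta[i] -= move / beta
                for j in range(n):
                    t = min(move, cap[j])              # greedy choice of t_{i,j}
                    cap[j] -= t
                    move -= t
                    zeta[j] += t / beta
        zetas[m] = zeta
        a = [a[i] - beta * xi[i] for i in range(n)]    # a_1
        b = [b[i] - beta * zeta[i] for i in range(n)]  # b_1
    zetas[0] = [t / betas[0] for t in b]               # the case N = 1
    return zetas

# Hypothetical data on a three-point state space.
xis = [[F(1), F(0), F(0)], [F(0), F(1), F(0)]]
betas = [F(1, 2), F(1, 2)]
b = [F(1, 4), F(1, 4), F(1, 2)]
zetas = redistribute(xis, betas, b)
# Cost of the coupling that pairs xi_k with zeta_k.
cost = sum(beta * l1([s - t for s, t in zip(xi, zeta)])
           for beta, xi, zeta in zip(betas, xis, zetas))
```

In this example $a = (1/2, 1/2, 0)$, so $\|a - b\| = 1$, and the coupling that pairs each $\xi_k$ with $\zeta_k$ has cost exactly 1. Together with the lower bound of Lemma 4.1 this gives $d_K(\varphi, \Psi) = \|a - b\|$.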

We now state yet another lemma which is a simple consequence of well-known results from the general theory on probability measures on complete,
separable, metric spaces. See e.g. [10] or [32], Chapter 2, section 6. We omit the
proof.

Lemma 4.3 For every $\mu \in \mathcal{P}(K)$ and every $\epsilon > 0$ we can find an integer $N$, a
sequence $\{\alpha_k, k = 1, 2, \ldots, N\}$ of real positive numbers satisfying

$$\sum_{k=1}^{N} \alpha_k = 1,$$

and a sequence $\{x_k, k = 1, 2, \ldots, N\}$ of elements in $K$ such that if we define

$$\nu = \sum_{k=1}^{N} \alpha_k\delta_{x_k}$$

then

$$d_K(\mu, \nu) < \epsilon.$$

The following two results are simple consequences of Lemma 4.3, Lemma 4.1, 
Lemma 4.2 and the triangle inequality. We omit the details of the proofs. (When 
S is finite and consequently K is compact, then the conclusion of Corollary 4.2 
below is well-known from the general theory on barycenters. See e.g Proposition 
26.4 in 0.) 

Corollary 4.1 Let $q \in K$. For every $\mu \in \mathcal{P}(K|q)$ and every $\epsilon > 0$ we can find
an integer $N$, a sequence $\{\alpha_k, k = 1, 2, \ldots, N\}$ of real positive numbers satisfying

$$\sum_{k=1}^{N} \alpha_k = 1,$$

and a sequence $\{x_k, k = 1, 2, \ldots, N\}$ of elements in $K$ such that if we define

$$\nu = \sum_{k=1}^{N} \alpha_k\delta_{x_k}$$

then

$$\nu \in \mathcal{P}(K|q)$$

and

$$d_K(\mu, \nu) < \epsilon.$$

Corollary 4.2 Let $\mu \in \mathcal{P}(K)$ and let $q \in K$. Then

$$d_K(\mu, \mathcal{P}(K|q)) = \|b(\mu) - q\|.$$

Before we prove Theorem 4.2 we shall state one more lemma that follows 
from the general theory regarding probabilities on compact, metric spaces. 
Lemma 4.4 Let $K' \subset K$ be a compact set and let $\mathcal{P}(K')$ be defined by

$$\mathcal{P}(K') = \{\nu \in \mathcal{P}(K) : \nu(K') = 1\}.$$

Then for every $\epsilon > 0$ we can find an integer $N$ and a set $\mathcal{R} = \{\nu_1, \nu_2, \ldots, \nu_N\}$ of
probability measures in $\mathcal{P}(K')$ such that for every $\nu \in \mathcal{P}(K')$

$$d_K(\nu, \mathcal{R}) < \epsilon.$$

We shall now prove Theorem 4.2. We repeat its formulation.

Theorem 4.2. For each $q \in K$ the set $\mathcal{P}(K|q)$ is a compact set in the
topology induced by the metric $d_K(\cdot, \cdot)$.

Proof. If the underlying set $S$ is finite then the set $K$ itself is compact and,
as we pointed out in the beginning of this section, the set $\mathcal{P}(K|q)$ is a closed set
and therefore, in this situation, $\mathcal{P}(K|q)$ is a compact set.

Thus assume that $S$ is an infinite set. To simplify notations we assume that
$S = \mathbb{N}$, the set of positive integers, and it is not difficult to convince oneself that
this is no loss of generality. With this choice of $S$ the set $K$ is defined by

$$K = \Big\{x = ((x)_i, i = 1, 2, \ldots) : (x)_i \ge 0, \sum_{i=1}^{\infty} (x)_i = 1\Big\}.$$

Next let us also note that if $q$ has finite support, that is: there exists an
integer $N$ such that $(q)_i = 0$ if $i > N$, then it follows that any measure $\nu \in \mathcal{P}(K|q)$ has support in the set $K' = \{x \in K : (x)_i = 0, i > N\}$. Since $K'$ is a
compact set it follows, as above, that $\mathcal{P}(K|q)$ is compact if $q$ has finite support.

It remains to consider the case when the vector $q$ does not have finite support.
Thus let $q \in K$ be given and assume that for every integer $N$ there exists
a number $i > N$ such that $(q)_i > 0$. Since $(K, \mathcal{E})$ is a closed subset of the
space $l_1$, which is a complete, separable, metric space it follows that $(K, \mathcal{E})$ is
also a complete, separable, metric space. Therefore, $\mathcal{P}(K)$ is also a complete,
separable, metric space. (From [10] Corollary 11.5.5 it follows that $\mathcal{P}(K)$ is
complete, and that $\mathcal{P}(K)$ is separable follows easily from Lemma 4.3.) Therefore
in order to prove that the set $\mathcal{P}(K|q)$ is compact in the topology induced by the
Kantorovich distance it suffices to prove that the set $\mathcal{P}(K|q)$ is totally bounded.

Thus let $\epsilon$, $0 < \epsilon < 1/2$, be chosen arbitrarily. What we shall do is to show
that we can find a set $\mathcal{N} = \{\mu_1, \mu_2, \ldots, \mu_N\}$ of probability measures in $\mathcal{P}(K|q)$
such that for any $\mu \in \mathcal{P}(K|q)$ the following inequality holds:

$$d_K(\mu, \mathcal{N}) < \epsilon.$$



In order to do this, let us define the number $L$ by

$$L = \min\Big\{m : \sum_{n=m}^{\infty} (q)_n < \epsilon/6\Big\}.$$

Next let us define

$$K' = \Big\{x \in K : \sum_{i=1}^{L} (x)_i = 1\Big\}.$$
Clearly $K'$ is a compact subset of $(K, \mathcal{E})$.
Let us also define $\Delta_q$ by

$$\Delta_q = \sum_{i=L+1}^{\infty} (q)_i$$

and the vector $q' \in K$ by

$$(q')_i = (q)_i/(1 - \Delta_q), \quad i = 1, 2, \ldots, L,$$

$$(q')_i = 0 \quad \text{if } i \ge L + 1.$$

Clearly $q' \in K'$ and $\Delta_q < \epsilon/6$. Furthermore

$$\|q - q'\| = \sum_{i=1}^{L} ((q)_i/(1 - \Delta_q) - (q)_i) + \Delta_q = (\Delta_q/(1 - \Delta_q)) \sum_{i=1}^{L} (q)_i + \Delta_q = 2\Delta_q < 2\epsilon/6. \qquad (54)$$
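
The computation in (54) can be checked on a concrete vector. The sketch below assumes, purely for illustration, the geometric vector $(q)_i = 2^{-i}$ and the cut-off $L = 5$; the helper name `truncate` is hypothetical. Only the first $L$ coordinates and the tail mass $\Delta_q$ are needed.

```python
from fractions import Fraction as F

def truncate(q_head, tail_mass):
    """Given the first L coordinates of q and the tail mass Delta_q, return the
    first L coordinates of q' and the distance ||q - q'||."""
    q_prime = [t / (1 - tail_mass) for t in q_head]
    dist = sum(p - t for p, t in zip(q_prime, q_head)) + tail_mass
    return q_prime, dist

L = 5
q_head = [F(1, 2 ** i) for i in range(1, L + 1)]  # (q)_i = 2^{-i}, i = 1, ..., L
delta = 1 - sum(q_head)                           # tail mass, here 2^{-L}
q_prime, dist = truncate(q_head, delta)
```

Rescaling the first $L$ coordinates adds back exactly the mass $\Delta_q$ that was cut off, so the distance is $2\Delta_q$, as in (54).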

Now, since $K'$ is a compact set, it follows by Lemma 4.4 that there exist an
integer $N$ and a set $\mathcal{R} = \{\nu_1, \nu_2, \ldots, \nu_N\}$ of probability measures in $\mathcal{P}(K')$ such
that for every $\nu \in \mathcal{P}(K'|q')$

$$d_K(\nu, \mathcal{R}) < \epsilon/6. \qquad (55)$$

Furthermore, since $\|q' - q\| < 2\epsilon/6$ (see (54)), and $\|b(\nu_j) - q'\| < \epsilon/6$ because
of (55) and Lemma 4.1, it follows from Corollary 4.2 that we can find measures
$\mu_j \in \mathcal{P}(K|q)$, $j = 1, 2, \ldots, N$ such that

$$d_K(\nu_j, \mu_j) < 3\epsilon/6.$$

Now set $\mathcal{N} = \{\mu_1, \mu_2, \ldots, \mu_N\}$. We claim that

$$d_K(\mu, \mathcal{N}) < \epsilon$$

for all $\mu \in \mathcal{P}(K|q)$. But this is almost obvious from the way we constructed the
set $\mathcal{N}$. For let $\mu \in \mathcal{P}(K|q)$. Since $\|q - q'\| < 2\epsilon/6$ it follows from Corollary 4.2
that we can find a probability measure $\nu \in \mathcal{P}(K|q')$ such that

$$d_K(\mu, \nu) < 2\epsilon/6. \qquad (56)$$

Then we can first find a measure $\nu^* \in \mathcal{R}$ such that

$$d_K(\nu, \nu^*) < \epsilon/6 \qquad (57)$$

and then find a probability measure $\mu^* \in \mathcal{N}$ such that

$$d_K(\mu^*, \nu^*) < 3\epsilon/6. \qquad (58)$$

Finally by the triangle inequality and the inequalities (56), (57) and (58) we
conclude that

$$d_K(\mu, \mathcal{N}) < \epsilon$$

from which follows that the set $\mathcal{P}(K|q)$ is totally bounded which was what we
needed to prove in order to prove that the set $\mathcal{P}(K|q)$ is compact. □

Theorem 4.1. Let $S$ be infinite and let $q \in K$. Then $\mathcal{P}(K|q)$ is tight.
Proof. Follows from [10], Theorem 11.5.4. □

We end this section with the following lemma to be used later. For $i \in S$
and $0 < \eta < 1$, we define the set $E_i(\eta)$ by

$$E_i(\eta) = \{x \in K : (x)_i \ge \eta\}. \qquad (59)$$

Lemma 4.5 Let $S$ be denumerable, let $i \in S$, let $q \in K$ and suppose also that
$(q)_i > 0$. Then we can find a compact set $C$ such that for all $\mu \in \mathcal{P}(K|q)$

$$\mu(E_i((q)_i/2) \cap C) \ge (q)_i/3. \qquad (60)$$

Proof. Let $\mu \in \mathcal{P}(K|q)$. Since $\int_K (y)_i\mu(dy) = (q)_i$ and $0 \le (y)_i \le 1$ if $y \in K$
one easily obtains the estimate

$$\mu(E_i((q)_i/2)) \ge (q)_i/2$$

for all $\mu \in \mathcal{P}(K|q)$. Furthermore, since the set $\mathcal{P}(K|q)$ is tight, we can find a
compact set $C \subset K$ such that

$$\mu(C) > 1 - (q)_i/6$$

for all $\mu \in \mathcal{P}(K|q)$. Therefore, if we set

$$B(i) = E_i((q)_i/2),$$

$$B^c(i) = K \setminus E_i((q)_i/2)$$

and

$$C^c = K \setminus C,$$

we obtain

$$\mu(B(i) \cap C) = 1 - \mu(B^c(i) \cup C^c) \ge 1 - \mu(B^c(i)) - \mu(C^c) = \mu(B(i)) - (1 - \mu(C)) > (q)_i/2 - (q)_i/6 = (q)_i/3. \quad □$$

5 The barycenters of Markov chains induced by 
partitions. 

In this section we prove two more theorems concerning barycenters. The latter 
together with Theorem 4.1 of the previous section implies a useful tightness 
result. 

Theorem 5.1 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$ and let $P \in PM(S \times S)$
be the associated tr.pr.m. Then, for all $x \in K$,

\[ b(P_{\mathcal{M}}^n \delta_x) = xP^n, \quad n = 1, 2, \dots. \tag{61} \]

Proof. For an arbitrary $i \in S$ define $u_i \in C[K]$ by

\[ u_i(x) = (x)_i. \]

It then follows that

\[ \langle u_i, P_{\mathcal{M}}\delta_x \rangle = \langle T_{\mathcal{M}}u_i, \delta_x \rangle = T_{\mathcal{M}}u_i(x) = \sum_{w \in W_{\mathcal{M}}(x)} u_i\bigl(xM(w)/\|xM(w)\|\bigr) \cdot \|xM(w)\| \]
\[ = \sum_{w \in W_{\mathcal{M}}(x)} \bigl((xM(w))_i/\|xM(w)\|\bigr) \cdot \|xM(w)\| = \sum_{w \in W} (xM(w))_i = (xP)_i, \]

from which it follows that (61) holds for $n = 1$.

That (61) also holds for $n \ge 2$ now follows from the fact that $P_{\mathcal{M}}^n = P_{\mathcal{M}^n}$
(see Corollary 2.1) and the fact that $\mathcal{M}^n$ is a partition of $P^n$. □
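
The case n = 1 of (61) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 3-state matrices `M1`, `M2` and the vector `x` are hypothetical and do not come from the paper, and the helper names are ours. It lists the atoms of the one-step distribution started at x and compares their barycenter with xP, using exact rational arithmetic.

```python
from fractions import Fraction as F

def vec_mat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def one_step(x, partition):
    """Atoms (weight, point) of the one-step distribution from x:
    weight ||xM(w)|| (l1 norm), point xM(w)/||xM(w)||, for every w with ||xM(w)|| > 0."""
    atoms = []
    for M in partition:
        y = vec_mat(x, M)
        norm = sum(y)  # entries are nonnegative, so the sum is the l1 norm
        if norm > 0:
            atoms.append((norm, [v / norm for v in y]))
    return atoms

def barycenter(atoms):
    """Weighted mean of the atoms, coordinate by coordinate."""
    dim = len(atoms[0][1])
    return [sum(w * p[i] for w, p in atoms) for i in range(dim)]

# Hypothetical example: P = M1 + M2 is a 3 x 3 stochastic matrix.
M1 = [[F(1, 2), 0, 0], [F(1, 4), F(1, 4), 0], [0, 0, F(1, 3)]]
M2 = [[0, F(1, 2), 0], [0, 0, F(1, 2)], [F(1, 3), F(1, 3), 0]]
P = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(M1, M2)]
x = [F(1, 5), F(3, 10), F(1, 2)]
```

Calling `barycenter(one_step(x, [M1, M2]))` and `vec_mat(x, P)` should give the same vector, since the normalisations cancel exactly as in the proof.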

We shall now prove the following interesting result which we also state as a 
theorem. 

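Theorem 5.2 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$, let $P$ be the associated
tr.pr.m, let $\pi \in K$ and suppose that $\pi = \pi P$. Then,

\[ P_{\mathcal{M}}\mu \in \mathcal{P}(K|\pi), \quad \forall \mu \in \mathcal{P}(K|\pi). \]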

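Proof. First assume that $\mu \in \mathcal{P}(K|\pi)$ can be written

\[ \mu = \sum_{k=1}^{N} \alpha_k \delta_{y_k} \tag{62} \]

for some integer $N$, where $\alpha_k > 0$, $k = 1, 2, \dots, N$, $\sum_k \alpha_k = 1$, and $y_k \in K$, $k = 1, 2, \dots, N$.

Next, let $i \in S$ be chosen arbitrarily, and let the function $u \in C[K]$ be
defined by $u(x) = (x)_i$. Below we shall at a few places write

\[ u_1 = T_{\mathcal{M}}u. \]

What we shall prove is that $\langle u, P_{\mathcal{M}}\mu \rangle = (\pi)_i$. Using the fact that

\[ u(\alpha x + \beta y) = \alpha u(x) + \beta u(y), \quad x \in K,\ y \in K,\ \alpha \ge 0,\ \beta \ge 0,\ \alpha + \beta = 1, \]
and the fact that

\[ (\alpha x + \beta y)P = \alpha xP + \beta yP, \quad x \in K,\ y \in K,\ \alpha \ge 0,\ \beta \ge 0,\ \alpha + \beta = 1, \]
we find

\[ \langle u, P_{\mathcal{M}}\mu \rangle = \langle T_{\mathcal{M}}u, \mu \rangle = \langle u_1, \mu \rangle = \sum_{k=1}^{N} \alpha_k u_1(y_k) \]
\[ = \sum_{k=1}^{N} \alpha_k \sum_{w \in W_{\mathcal{M}}(y_k)} u\bigl(y_kM(w)/\|y_kM(w)\|\bigr) \cdot \|y_kM(w)\| \]
\[ = \sum_{k=1}^{N} \alpha_k \sum_{w \in W_{\mathcal{M}}(y_k)} (y_kM(w))_i = \sum_{k=1}^{N} \alpha_k \sum_{w \in W} (y_kM(w))_i \]
\[ = \sum_{k=1}^{N} \alpha_k (y_kP)_i = \Bigl(\Bigl(\sum_{k=1}^{N} \alpha_k y_k\Bigr)P\Bigr)_i = (\pi P)_i = (\pi)_i, \]

and thereby we have proved the assertion of the theorem when the measure $\mu$
can be written as in (62).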
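Next let $\mu \in \mathcal{P}(K|\pi)$ be chosen arbitrarily and let also $\epsilon > 0$ be chosen
arbitrarily. From Corollary 4.1 we know that we can find an integer $N$, a
sequence $\{x_1, x_2, \dots, x_N\}$ of elements in $K$, and a sequence $\{\alpha_1, \alpha_2, \dots, \alpha_N\}$ of
positive numbers satisfying $\sum_{j=1}^{N} \alpha_j = 1$, such that if we define

\[ \nu = \sum_{j=1}^{N} \alpha_j \delta_{x_j} \]

then
1) $\nu \in \mathcal{P}(K|\pi)$,
2) $d_K(\mu, \nu) < \epsilon/3$.

Hence by Corollary 3.3 it follows that

\[ d_K(P_{\mathcal{M}}\mu, P_{\mathcal{M}}\nu) < \epsilon. \]

Since $\gamma(u) = 1$ when $u$ is defined as above it follows that $u \in \mathrm{Lip}_1[K]$, and
since

\[ \langle T_{\mathcal{M}}u, \nu \rangle = (\pi)_i \]

it follows that

\[ |\langle T_{\mathcal{M}}u, \mu \rangle - (\pi)_i| < \epsilon, \]
and since $\epsilon$ is arbitrary it follows that

\[ \langle T_{\mathcal{M}}u, \mu \rangle = (\pi)_i. \]

Since also $i \in S$ was chosen arbitrarily, it follows that $b(P_{\mathcal{M}}\mu) = \pi$ for all
$\mu \in \mathcal{P}(K|\pi)$, which was what we wanted to prove. □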

From Theorem 5.2 and Theorem 4.1 we now immediately obtain the following 
tightness result. 

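Theorem 5.3 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$, let $P$ be the associated
tr.pr.m, let $\pi \in K$ and suppose that $\pi P = \pi$. Then, for all $\mu \in \mathcal{P}(K|\pi)$,

\[ \{P_{\mathcal{M}}^n\mu,\ n = 1, 2, \dots\} \]

is a tight sequence.

Conjecture 5.1 Let $S$ be denumerable, let $\mathcal{M} \in \mathcal{G}(S)$, let $P$ be the associated
tr.pr.m, let $\pi \in K$ and suppose that $\pi P = \pi$. Then

\[ d_K(P_{\mathcal{M}}\mu, P_{\mathcal{M}}\nu) \le d_K(\mu, \nu), \quad \mu, \nu \in \mathcal{P}(K|\pi). \]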
6 An auxiliary theorem for Markov chains in
complete, separable, metric spaces.

In this section we shall prove a limit theorem for Markov chains in a complete,
separable, metric space, which we shall apply in the next section to Markov
chains generated by tr.pr.fs induced by partitions of tr.pr.ms.

In this section $(K, \mathcal{E})$ will denote an arbitrary complete, separable, metric
space, with metric $\delta$ and $\sigma$-algebra $\mathcal{E}$. Other notations will be the same as
before. For $u \in B[K]$ we define

\[ \mathrm{osc}(u) = \sup\{|u(x) - u(y)| : x \in K,\ y \in K\}. \]

As before $\gamma(u)$ denotes the Lipschitz constant defined by

\[ \gamma(u) = \sup\{|u(x) - u(y)|/\delta(x, y) : x \in K,\ y \in K,\ x \ne y\}. \]

Let $Q : K \times \mathcal{E} \to [0, 1]$ be a tr.pr.f on $(K, \mathcal{E})$, and let $Q^n : K \times \mathcal{E} \to [0, 1]$, $n = 1, 2, \dots$,
be the sequence of tr.pr.fs defined recursively by

\[ Q^1(x, B) = Q(x, B), \quad B \in \mathcal{E}, \]
\[ Q^{n+1}(x, B) = \int_K Q^n(y, B)\,Q(x, dy), \quad B \in \mathcal{E}. \]

We let $T : B[K] \to B[K]$ denote the transition operator associated to $Q$, defined
as usual by

\[ Tu(x) = \int_K u(y)\,Q(x, dy). \]

We define $T^0u(x) = u(x)$. Note that

\[ \mathrm{osc}(T^{n+1}u) \le \mathrm{osc}(T^nu), \quad n = 0, 1, 2, \dots,\ u \in B[K]. \tag{63} \]
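
For a finite state space the operator T is just multiplication of a column of function values by a stochastic matrix, and (63) says that averaging cannot increase the spread of the values. The following minimal sketch illustrates this; the kernel `Q` and the function `u` are hypothetical choices made only for the illustration.

```python
from fractions import Fraction as F

def apply_T(Q, u):
    """(Tu)(x) = sum over y of u(y) Q(x, y), for a finite-state kernel Q."""
    return [sum(q * uy for q, uy in zip(row, u)) for row in Q]

def osc(u):
    """Oscillation of u: largest value minus smallest value."""
    return max(u) - min(u)

# Hypothetical 3-state kernel (each row sums to one) and test function.
Q = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 10), F(3, 5), F(3, 10)],
     [F(0), F(1, 5), F(4, 5)]]
u = [F(3), F(-1), F(2)]

# Record osc(T^n u) for n = 0, 1, ..., 5.
oscillations = []
v = u
for _ in range(6):
    oscillations.append(osc(v))
    v = apply_T(Q, v)
```

Each entry of `oscillations` should be no larger than the previous one, because every value of Tu is a convex combination of values of u.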
If a measure $\mu \in \mathcal{P}(K)$ is such that

\[ \int_K Q(x, B)\,\mu(dx) = \mu(B), \quad \forall B \in \mathcal{E}, \]

then $\mu$ is an invariant probability measure of $Q$. If $Q$ is such that

\[ u \in C[K] \Rightarrow Tu \in C[K], \]
then $Q$ is Feller continuous. If a set $\mathcal{D} \subset C[K]$ is such that for any two measures $\mu, \nu \in \mathcal{P}(K)$

\[ \int_K u(y)\,\mu(dy) = \int_K u(y)\,\nu(dy),\ \forall u \in \mathcal{D} \quad \Rightarrow \quad \int_K u(y)\,\mu(dy) = \int_K u(y)\,\nu(dy),\ \forall u \in C[K], \]

then we say that the set $\mathcal{D}$ is measure-determining.

It is well-known that $\mathrm{Lip}[K]$ is measure-determining when $(K, \mathcal{E})$ is a complete,
separable, metric space. (From [22], Chapter II, Theorem 6.1, it follows that
the subset of $C[K]$ consisting of uniformly continuous functions is measure-determining,
and since, for any uniformly continuous function $u$ and any
$\epsilon > 0$, one can find a function $v \in \mathrm{Lip}[K]$ such that $\|u - v\| < \epsilon$, it follows that
$\mathrm{Lip}[K]$ is measure-determining.)

When stating and proving the forthcoming theorem we shall use the notion of the
shrinking property, which we introduced in the introduction. We repeat it here
for the sake of convenience.

Definition 6.1 Let $Q$ be a tr.pr.f, and let $T$ be the associated transition operator.
If for every $\rho > 0$ there exists a number $\alpha$, $0 < \alpha < 1$, such that for every
nonempty, compact set $A \subset K$, every $\eta > 0$ and every $\kappa > 0$, there exist an
integer $N$ and another nonempty, compact set $B \subset K$ such that, if the integer
$n \ge N$, then for all $u \in \mathrm{Lip}[K]$

\[ \sup_{x, y \in A} |T^nu(x) - T^nu(y)| \le \eta\gamma(u) + \kappa\,\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1 - \alpha) \sup_{z_1, z_2 \in B} |T^{n-N}u(z_1) - T^{n-N}u(z_2)|, \]

then we say that $Q$ has the shrinking property. We call $\alpha$ a shrinking
number associated to $\rho$.

We now first prove the following lemma. 

Lemma 6.1 Suppose that the tr.pr.f $Q$ has the shrinking property. Then for
every nonempty compact set $C \subset K$ and every $u \in \mathrm{Lip}[K]$

\[ \lim_{n \to \infty} \sup_{x, y \in C} \Bigl| \int_K u(z)\,Q^n(x, dz) - \int_K u(z)\,Q^n(y, dz) \Bigr| = 0. \tag{64} \]

Proof. Let $C$ be a given nonempty, compact set. Let also $\epsilon > 0$ be given. In
order to prove the lemma it suffices to prove that, for every $u \in \mathrm{Lip}[K]$, we can
find an integer $N$ such that

\[ \sup_{x_0, y_0 \in C} \Bigl| \int_K u(z)\,Q^n(x_0, dz) - \int_K u(z)\,Q^n(y_0, dz) \Bigr| < 4\epsilon \tag{65} \]

if $n \ge N$.

Thus let $u \in \mathrm{Lip}[K]$ be given. Obviously (64) holds if $\gamma(u) = 0$. Therefore
we may assume that $\gamma(u) > 0$, which also implies that $\mathrm{osc}(u) > 0$.
We now define

\[ \rho = \epsilon/\gamma(u). \]

Next let $\alpha > 0$ be a shrinking number associated to $\rho$. Define the integer $M$ by

\[ M = \min\{m : (1 - \alpha)^m < \epsilon/\mathrm{osc}(u)\}. \tag{66} \]

From the shrinking property it now follows that if we choose $\eta = \eta_1 = \epsilon/(2\gamma(u))$
and choose $\kappa = \kappa_1 = \epsilon/(2\,\mathrm{osc}(u))$, then we can find an integer $N_1$ and a compact
set $A_1$ such that

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| \le \eta_1\gamma(u) + \kappa_1\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1 - \alpha) \sup_{x, y \in A_1} |T^{n-N_1}u(x) - T^{n-N_1}u(y)| \]
\[ \le \epsilon/2 + \epsilon/2 + \alpha\epsilon + (1 - \alpha) \sup_{x, y \in A_1} |T^{n-N_1}u(x) - T^{n-N_1}u(y)| \tag{67} \]
for $n \ge N_1$.

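Next setting $\eta = \eta_2 = \epsilon/(4\gamma(u))$ and $\kappa = \kappa_2 = \epsilon/(4\,\mathrm{osc}(u))$, and again
using the shrinking property, it follows that we can find an integer $N_2$ and a
new compact set $A_2$ such that

\[ \sup_{x_1, y_1 \in A_1} |T^mu(x_1) - T^mu(y_1)| \le \eta_2\gamma(u) + \kappa_2\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1 - \alpha) \sup_{x, y \in A_2} |T^{m-N_2}u(x) - T^{m-N_2}u(y)| \]
\[ \le \epsilon/4 + \epsilon/4 + \alpha\epsilon + (1 - \alpha) \sup_{x, y \in A_2} |T^{m-N_2}u(x) - T^{m-N_2}u(y)| \tag{68} \]

for $m \ge N_2$.

Hence, by combining (67) and (68) we find that if $n \ge N_1 + N_2$ and we put
$m = n - N_1$ then

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| \le \epsilon/2 + \epsilon/2 + \alpha\epsilon + (1 - \alpha) \sup_{x, y \in A_1} |T^mu(x) - T^mu(y)| \]
\[ \le \epsilon/2 + \epsilon/2 + \alpha\epsilon + (1 - \alpha)\Bigl(\epsilon/4 + \epsilon/4 + \alpha\epsilon + (1 - \alpha) \sup_{x, y \in A_2} |T^{m-N_2}u(x) - T^{m-N_2}u(y)|\Bigr) \]
\[ \le (\epsilon/2 + \epsilon/4) + (\epsilon/2 + \epsilon/4) + \alpha\epsilon(1 + (1 - \alpha)) + (1 - \alpha)^2 \sup_{x, y \in A_2} |T^{n-(N_1+N_2)}u(x) - T^{n-(N_1+N_2)}u(y)|. \]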

And in this way we proceed. Thus we define the numbers $\eta_i$, $i = 1, 2, \dots, M$, by

\[ \eta_i = \epsilon/(2^i\gamma(u)), \]
the numbers $\kappa_i$, $i = 1, 2, \dots, M$, by

\[ \kappa_i = \epsilon/(2^i\,\mathrm{osc}(u)), \]

and having defined the compact sets $A_i$ for $i = 1, 2, \dots, j - 1$, and the integers
$N_i$ for $i = 1, 2, \dots, j - 1$, it follows from the shrinking property that we can
define $A_j$ and $N_j$ such that

\[ \sup_{x, y \in A_{j-1}} |T^mu(x) - T^mu(y)| \le \eta_j\gamma(u) + \kappa_j\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1 - \alpha) \sup_{x, y \in A_j} |T^{m-N_j}u(x) - T^{m-N_j}u(y)| \]
if $m \ge N_j$. By induction it follows that if the integer $n$ satisfies $n \ge N_1 + N_2 + \dots + N_j$ then

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| \le \sum_{i=1}^{j} \epsilon/2^i + \sum_{i=1}^{j} \epsilon/2^i + \epsilon\alpha\bigl(1 + (1 - \alpha) + (1 - \alpha)^2 + \dots + (1 - \alpha)^{j-1}\bigr) \]
\[ + (1 - \alpha)^j \sup_{x, y \in A_j} |T^{n-(N_1+\dots+N_j)}u(x) - T^{n-(N_1+\dots+N_j)}u(y)|. \]

In particular, if $j = M$ and the integer $n$ satisfies $n \ge N_1 + N_2 + \dots + N_M$, then

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| \le \sum_{i=1}^{M} \epsilon/2^i + \sum_{i=1}^{M} \epsilon/2^i + \epsilon\alpha\bigl(1 + (1 - \alpha) + (1 - \alpha)^2 + \dots + (1 - \alpha)^{M-1}\bigr) \]
\[ + (1 - \alpha)^M \sup_{x, y \in A_M} |T^{n-N}u(x) - T^{n-N}u(y)|, \]
where $N = N_1 + N_2 + \dots + N_M$, and by using (63) and the fact that

\[ \epsilon\alpha\bigl(1 + (1 - \alpha) + (1 - \alpha)^2 + \dots + (1 - \alpha)^M\bigr) < \epsilon, \]
we find that if $n \ge N$ then

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| < \epsilon + \epsilon + \epsilon + (1 - \alpha)^M\mathrm{osc}(u), \]

and since $M$ is chosen in such a way that

\[ (1 - \alpha)^M\mathrm{osc}(u) < \epsilon \]

(see (66)) it follows that

\[ \sup_{x_0, y_0 \in C} |T^nu(x_0) - T^nu(y_0)| < 4\epsilon \]

if $n \ge N$. Hence (65) holds if $n \ge N$, from which the lemma follows. □
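
The bookkeeping in the proof of Lemma 6.1 can be traced numerically. The sketch below is an illustration only: the function names are ours and the numerical values of the shrinking number, the tolerance and the oscillation are hypothetical. It chooses M as in (66) and adds up the four contributions to the final bound (the two dyadic sums, the geometric sum and the tail term).

```python
def choose_M(alpha, eps, osc_u):
    """Smallest m with (1 - alpha)**m < eps / osc_u, as in (66)."""
    m = 0
    while (1 - alpha) ** m >= eps / osc_u:
        m += 1
    return m

def final_bound(alpha, eps, osc_u):
    """Return (M, bound), where bound adds the terms appearing after M shrinking steps."""
    M = choose_M(alpha, eps, osc_u)
    # sum of eta_i * gamma(u); the sum of kappa_i * osc(u) has the same value
    dyadic = sum(eps / 2 ** i for i in range(1, M + 1))
    # eps * alpha * (1 + (1 - alpha) + ... + (1 - alpha)**(M - 1))
    geometric = eps * alpha * sum((1 - alpha) ** i for i in range(M))
    # remaining supremum, bounded by osc(u) through (63)
    tail = (1 - alpha) ** M * osc_u
    return M, 2 * dyadic + geometric + tail

# Hypothetical values: shrinking number 0.1, tolerance 0.01, oscillation 5.
M, bound = final_bound(alpha=0.1, eps=0.01, osc_u=5.0)
```

With these inputs `bound` should stay below `4 * 0.01`, matching the estimate $4\epsilon$ in (65); each of the four contributions is individually below $\epsilon$.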

Theorem 6.1 Suppose that the tr.pr.f $Q$ is Feller continuous, that $Q$ has the
shrinking property and that there exists a point $x^* \in K$ such that the sequence
$\{Q^n(x^*, \cdot),\ n = 1, 2, \dots\}$ is a tight sequence of probability measures. Then $Q$ is
asymptotically stable.

Proof. Since $Q$ is Feller continuous and $\{Q^n(x^*, \cdot),\ n = 1, 2, \dots\}$ is tight, it is
well-known from the general theory of Markov chains that there exists at least
one invariant measure for $Q$.

That there is only one invariant measure $\nu$, say, is not difficult to prove by
contradiction if one uses Lemma 6.1. For suppose both $\nu$ and $\mu$ are invariant
measures for $Q$. Suppose = 1, and that $a > 0$, where $a$ is defined by

\[ a = \int_K u(x)\,\mu(dx) - \int_K u(x)\,\nu(dx). \]

Since we have assumed that $(K, \mathcal{E})$ is a complete, separable, metric space, both $\nu$
and $\mu$ are tight measures. (See e.g. [3], Theorem 1.4.) By choosing the compact
set $C$ sufficiently large it is clear that for every nonnegative integer $n = 0, 1, 2, \dots$
we have

\[ a = \int_K T^nu(x)\,\mu(dx) - \int_K T^nu(y)\,\nu(dy) \le a/4 + \int_C \int_C |T^nu(x) - T^nu(y)|\,\mu(dx)\,\nu(dy), \]

and then from Lemma 6.1 we can conclude that

\[ \int_C \int_C |T^nu(x) - T^nu(y)|\,\mu(dx)\,\nu(dy) < a/4 \]

if $n$ is sufficiently large, and hence $a < a/2$, and we have obtained our contradiction.
After one has proved the uniqueness, it suffices to prove that

\[ \lim_{n \to \infty} \int_K u(y)\,Q^n(x, dy) = \int_K u(y)\,\nu(dy) \]

for all $u$ in $\mathrm{Lip}[K]$, since this set is measure-determining.
To do this let us first prove that

\[ \lim_{n \to \infty} |T^{n+1}u(x^*) - T^nu(x^*)| = 0 \tag{69} \]

for $u \in \mathrm{Lip}[K]$. Thus, let $u \in \mathrm{Lip}[K]$ be given and let also $\epsilon > 0$ be given.
Choose the compact set $C$ so large that $(1 - Q(x^*, C))\,\mathrm{osc}(u) < \epsilon/2$. Since

\[ |T^{n+1}u(x^*) - T^nu(x^*)| = \Bigl| \int_K (T^nu(y) - T^nu(x^*))\,Q(x^*, dy) \Bigr| \]

it follows that

\[ |T^{n+1}u(x^*) - T^nu(x^*)| \le \Bigl| \int_C (T^nu(y) - T^nu(x^*))\,Q(x^*, dy) \Bigr| + \epsilon/2, \]

and since

\[ \Bigl| \int_C (T^nu(y) - T^nu(x^*))\,Q(x^*, dy) \Bigr| < \epsilon/2 \]

if $n$ is sufficiently large because of Lemma 6.1, statement (69) follows.

Then, since $\{Q^n(x^*, \cdot),\ n = 1, 2, \dots\}$ is tight, it follows that there exists a
measure $\mu$, say, and an increasing sequence of integers $n_j$, $j = 1, 2, \dots$, such that

\[ \lim_{j \to \infty} \int_K u(y)\,Q^{n_j}(x^*, dy) = \lim_{j \to \infty} T^{n_j}u(x^*) = \int_K u(y)\,\mu(dy) \tag{70} \]

for all $u \in C[K]$.

Now let $u \in \mathrm{Lip}[K]$ and set $u_1 = Tu$. Since $Tu \in \mathrm{Lip}[K]$ if $u \in \mathrm{Lip}[K]$, it
follows from (70) that

\[ \lim_{j \to \infty} \int_K u(y)\,Q^{n_j+1}(x^*, dy) = \lim_{j \to \infty} T^{n_j+1}u(x^*) = \lim_{j \to \infty} T^{n_j}u_1(x^*) = \int_K u_1(y)\,\mu(dy), \]

which together with (69) and (70) implies that

\[ \int_K u(y)\,\mu(dy) = \int_K Tu(y)\,\mu(dy). \]

Since $\mathrm{Lip}[K]$ is measure-determining it follows that $\mu$ is an invariant measure of
$Q$, and therefore $\mu = \nu$, since we had proved that $\nu$ was the only invariant measure
of $Q$. But since the right hand side of (70) is independent of the subsequence,
it follows that we in fact have

\[ \lim_{n \to \infty} \int_K u(y)\,Q^n(x^*, dy) = \lim_{n \to \infty} T^nu(x^*) = \int_K u(y)\,\nu(dy). \tag{71} \]

Finally let $x \in K$ be chosen arbitrarily. Since

\[ \lim_{n \to \infty} T^nu(x^*) = \int_K u(y)\,\nu(dy), \]

it follows from Lemma 6.1 that also

\[ \lim_{n \to \infty} T^nu(x) = \int_K u(y)\,\nu(dy), \]

and hence

\[ \lim_{n \to \infty} \int_K u(y)\,Q^n(x, dy) = \int_K u(y)\,\nu(dy) \]

for $u$ in $\mathrm{Lip}[K]$, and since this set is measure-determining it follows that $Q$ is
asymptotically stable. □

We end this section by introducing the following terminology.

Definition 6.2 Let $Q$ be a tr.pr.f and let $T$ be the associated transition operator.

(i) If there exists a positive constant $L$ such that for every $u \in \mathrm{Lip}[K]$

\[ |Tu(x) - Tu(y)| \le L\gamma(u)\delta(x, y), \]
then we say that $Q$ is Lipschitz continuous.

(ii) If there exists a positive constant $L$ such that for every $u \in \mathrm{Lip}[K]$

\[ |T^nu(x) - T^nu(y)| \le L\gamma(u)\delta(x, y), \quad n = 0, 1, 2, \dots, \]
then we say that $Q$ is Lipschitz equicontinuous.

Proposition 6.1 Let $S$ be a denumerable set, let $\mathcal{M} \in \mathcal{G}(S)$ and let $T_{\mathcal{M}}$ denote
the transition operator on $K$ induced by $\mathcal{M}$. Then $T_{\mathcal{M}}$ is both Lipschitz
continuous and Lipschitz equicontinuous.

Proof. Follows immediately from Corollary 3.1 and Corollary 3.2. □



7 The proof of Theorem 1.1. 

Let us first repeat the definition of Condition B introduced in the introduction.

Condition B. For every $\rho > 0$ there exists an element $i_0 \in S$ such that if
$C \subset K$ is a compact set satisfying

\[ \mu(C \cap \{x : (x)_{i_0} \ge (\pi)_{i_0}/2\}) \ge (\pi)_{i_0}/3, \quad \forall \mu \in \mathcal{P}(K|\pi), \]

then we can find an integer $N$, and a sequence $\{w_1, w_2, \dots, w_N\}$ of elements in
$W$, such that, if we set

\[ M(\mathbf{w}^N) = M(w_1)M(w_2) \cdots M(w_N), \]
then

\[ \|e^{i_0}M(\mathbf{w}^N)\| > 0 \]
and if $x \in C \cap \{x : (x)_{i_0} \ge (\pi)_{i_0}/2\}$ then also

\[ \bigl\| xM(\mathbf{w}^N)/\|xM(\mathbf{w}^N)\| - e^{i_0}M(\mathbf{w}^N)/\|e^{i_0}M(\mathbf{w}^N)\| \bigr\| < \rho. \]

We now repeat the formulation of Theorem 1.1 using the notations intro- 
duced in section 2. 

Theorem 1.1 Let $S$ be a denumerable set, let $\pi \in K$ be positive, let $\mathcal{M} = \{M(w) : w \in W\} \in \mathcal{G}_\pi(S)$
and suppose that Condition B holds. Then $P_{\mathcal{M}}$ is
asymptotically stable.

Proof. As usual we denote the tr.pr.m associated to $\mathcal{M}$ by $P$, and since
$\mathcal{M} \in \mathcal{G}_\pi(S)$ we know that $\pi P = \pi$.

Since $(K, \mathcal{E})$ is a complete, separable, metric space it suffices to prove that
$P_{\mathcal{M}}$ satisfies the hypotheses of Theorem 6.1.

That $P_{\mathcal{M}}$ is Feller continuous follows easily from the fact that $P_{\mathcal{M}}$ is Lipschitz
continuous (see Proposition 6.1) and that $\mathrm{Lip}[K]$ is measure-determining.
From Theorem 5.3 we conclude that $\{P_{\mathcal{M}}^n(\pi, \cdot),\ n = 1, 2, \dots\}$ is a tight sequence,
since $\mathcal{M} \in \mathcal{G}_\pi(S)$ and $\delta_\pi \in \mathcal{P}(K|\pi)$.

It thus remains to show that $P_{\mathcal{M}}$ has the shrinking property. To simplify
notations we shall throughout the rest of this proof denote the transition probability
function Pm by P, denote the transition operator $T_{\mathcal{M}}$ by $T$ and the
transition probability operator Pm by P. Also recall that we have introduced
the notation $[x] = x/\|x\|$ when $\|x\| \ne 0$. (See (25).)

Thus let $\rho > 0$ be given. What we have to do is to show that we can find
a number $\alpha > 0$ such that for each nonempty compact set $A$ and each $\eta > 0$
and each $\kappa > 0$ we can find an integer $N$ and another nonempty compact set $B$
such that for each $u \in \mathrm{Lip}[K]$

\[ \sup_{x, y \in A} |T^nu(x) - T^nu(y)| \le \eta\gamma(u) + \kappa\,\mathrm{osc}(u) + \alpha\rho\gamma(u) + (1 - \alpha) \sup_{z, z' \in B} |T^{n-N}u(z) - T^{n-N}u(z')|. \tag{72} \]

Let $A$ be a given nonempty compact set, and let also $\eta > 0$ and $\kappa > 0$ be
given. From Corollary 3.2 we know that for every $u \in \mathrm{Lip}[K]$

\[ |T^nu(x) - T^nu(y)| \le 3\gamma(u)\delta(x, y), \quad n = 0, 1, 2, \dots. \]

Let us next recall, from the general theory of Markov chains, that for any
$z \in K$ and any $\epsilon > 0$, we can find an integer $N'$ such that if $n \ge N'$ then

\[ \|zP^n - \pi\| < \epsilon. \]

(See e.g. [H], Chapter 2.) Since the set $A$ given above is compact, it follows that
we can find an integer $N_1$ such that for all $z \in A$

\[ \|zP^n - \pi\| < \eta/6 \]

if $n \ge N_1$.
Now let $x \in A$ and $y \in A$ be given. Set

\[ \mu_{n,x}(\cdot) = P^n(x, \cdot), \quad n = 1, 2, \dots \]

and

\[ \mu_{n,y}(\cdot) = P^n(y, \cdot), \quad n = 1, 2, \dots. \]

From Theorem 5.1 it follows that $b(\mu_{n,x}) = xP^n$ and that $b(\mu_{n,y}) = yP^n$. Therefore,
if $n \ge N_1$, where $N_1$ is defined as above, we conclude that

\[ \|b(\mu_{n,x}) - \pi\| < \eta/6 \]

and that

\[ \|b(\mu_{n,y}) - \pi\| < \eta/6. \]

From Corollary 4.2 it now follows that we can find two measures $\nu_x$ and $\nu_y$, both
in $\mathcal{P}(K|\pi)$, such that if $u \in \mathrm{Lip}[K]$ then

\[ \Bigl| \int_K u(z)\,\mu_{N_1,x}(dz) - \int_K u(z)\,\nu_x(dz) \Bigr| < \gamma(u)\eta/6 \]

and

\[ \Bigl| \int_K u(z)\,\mu_{N_1,y}(dz) - \int_K u(z)\,\nu_y(dz) \Bigr| < \gamma(u)\eta/6. \]
From Corollary 3.2 we also find that for $m = 0, 1, 2, \dots$

\[ \Bigl| \int_K T^mu(z)\,\mu_{N_1,x}(dz) - \int_K T^mu(z)\,\nu_x(dz) \Bigr| < 3\gamma(u)\eta/6 = \gamma(u)\eta/2 \]

and similarly that

\[ \Bigl| \int_K T^mu(z)\,\mu_{N_1,y}(dz) - \int_K T^mu(z)\,\nu_y(dz) \Bigr| < 3\gamma(u)\eta/6 = \gamma(u)\eta/2. \]
Thus if $n \ge N_1$ we have

\[ |T^nu(x) - T^nu(y)| < \eta\gamma(u) + \Bigl| \int_K T^{n-N_1}u(z)\,\nu_x(dz) - \int_K T^{n-N_1}u(z)\,\nu_y(dz) \Bigr|. \tag{73} \]

AT JK 

We shall next define a shrinking coefficient $\alpha > 0$ associated to the given
number $\rho$. To do this we shall use Lemma 4.5 and Condition B.

From Lemma 4.5, it follows that for each $i \in S$ we can find a compact set
$C_i$ such that for all $\mu \in \mathcal{P}(K|\pi)$

\[ \mu(C_i \cap E_i((\pi)_i/2)) \ge (\pi)_i/3. \tag{74} \]

From Condition B it follows that we can find an element $i_0 \in S$, an integer
$N_0$ and a sequence $\{w_1, w_2, \dots, w_{N_0}\}$ of elements in $W$ depending on the set $C_{i_0}$,
such that if we set $M(w_1)M(w_2) \cdots M(w_{N_0}) = M(\mathbf{w}^{N_0})$ then

\[ \|e^{i_0}M(\mathbf{w}^{N_0})\| > 0 \]

and

\[ \bigl\| [xM(\mathbf{w}^{N_0})] - [e^{i_0}M(\mathbf{w}^{N_0})] \bigr\| < \rho/6, \quad \forall x \in E_{i_0}((\pi)_{i_0}/2) \cap C_{i_0}. \tag{75} \]
Next let us define $\alpha_1$ by

\[ \alpha_1 = ((\pi)_{i_0}/3) \cdot ((\pi)_{i_0}/2) \cdot \|e^{i_0}M(\mathbf{w}^{N_0})\|, \tag{76} \]

and let us define

\[ \alpha = \alpha_1^2/2. \]

Our aim is thus to verify (72) with this choice of $\alpha$ and with $N = N_1 + N_0$.
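In order to do this let us first set

\[ \nu_x^* = P^{N_0}\nu_x, \qquad \nu_y^* = P^{N_0}\nu_y, \]

let us write

\[ K^{(2)} = K \times K \]

and

\[ \mathcal{E}^{(2)} = \mathcal{E} \otimes \mathcal{E}, \]

and let $\nu_{x,y}^*$ denote the product measure on $(K^{(2)}, \mathcal{E}^{(2)})$ determined by $\nu_x^*$ and $\nu_y^*$.

Furthermore let us denote

\[ q_0 = e^{i_0}M(\mathbf{w}^{N_0})/\|e^{i_0}M(\mathbf{w}^{N_0})\| \]

and

\[ D = \{z \in K : \delta(z, q_0) < \rho/6\}. \]
Since $(z)_{i_0} \ge (\pi)_{i_0}/2$ if $z \in E_{i_0}((\pi)_{i_0}/2)$ and

\[ \|xA\| \ge (x)_i\|e^iA\|, \quad \forall i \in S, \tag{77} \]

if $A$ is a nonnegative matrix and $x \in K$, we conclude from (74), (75), (76) and
(77) that

\[ \nu_x^*(D) \ge ((\pi)_{i_0}/3) \cdot ((\pi)_{i_0}/2) \cdot \|e^{i_0}M(\mathbf{w}^{N_0})\| = \alpha_1 \]

and also that

\[ \nu_y^*(D) \ge ((\pi)_{i_0}/3) \cdot ((\pi)_{i_0}/2) \cdot \|e^{i_0}M(\mathbf{w}^{N_0})\| = \alpha_1. \]

Since $\nu_x, \nu_y \in \mathcal{P}(K|\pi)$, $\nu_x^* = P^{N_0}\nu_x$ and $\nu_y^* = P^{N_0}\nu_y$, it follows from Theorem
5.2 that $\nu_x^*, \nu_y^* \in \mathcal{P}(K|\pi)$. Since $\mathcal{P}(K|\pi)$ is a tight set it follows that we can find
a compact set $B$ independent of $x, y \in A$ such that

\[ \nu_x^*(D \cap B) \ge \alpha_1/\sqrt{2}, \qquad \nu_y^*(D \cap B) \ge \alpha_1/\sqrt{2}, \]

and also

\[ \nu_x^*(B) \ge 1 - \kappa/2, \qquad \nu_y^*(B) \ge 1 - \kappa/2. \]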
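Therefore, if we denote

\[ B^{(2)} = \{(z, z') \in K^{(2)} : z \in B,\ z' \in B\} \]

and set

\[ B_1^{(2)} = \{(z, z') \in B^{(2)} : \delta(z, z') < \rho/3\}, \]
we can conclude that

\[ \nu_{x,y}^*(B_1^{(2)}) \ge \nu_x^*(D \cap B) \cdot \nu_y^*(D \cap B) \ge \alpha_1^2/2 = \alpha, \]

and that

\[ \nu_{x,y}^*(B^{(2)}) \ge (1 - \kappa/2)^2 > 1 - \kappa. \]

Next let us define the sets $B_2^{(2)}$ and $B_3^{(2)}$ by

\[ B_2^{(2)} = \{(z, z') \in B^{(2)} : \delta(z, z') \ge \rho/3\} \]

and

\[ B_3^{(2)} = K^{(2)} \setminus B^{(2)}. \]

Obviously

\[ B_k^{(2)} \cap B_m^{(2)} = \emptyset, \quad 1 \le k < m \le 3, \]

and

\[ \bigcup_{i=1}^{3} B_i^{(2)} = K^{(2)}. \]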

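We shall now estimate

\[ \Bigl| \int_K T^{n-N_1}u(z)\,\nu_x(dz) - \int_K T^{n-N_1}u(z)\,\nu_y(dz) \Bigr| \]

for $n \ge N_1 + N_0 = N$. We set

\[ m = n - N \quad \text{and} \quad v = T^mu. \]
Note first that if $n \ge N$ then

\[ \Bigl| \int_K T^{n-N_1}u(z)\,\nu_x(dz) - \int_K T^{n-N_1}u(z)\,\nu_y(dz) \Bigr| = |\langle T^{m+N_0}u, \nu_x \rangle - \langle T^{m+N_0}u, \nu_y \rangle| \]
\[ = |\langle T^mu, \nu_x^* \rangle - \langle T^mu, \nu_y^* \rangle| = |\langle v, \nu_x^* \rangle - \langle v, \nu_y^* \rangle|. \]
Next let us note that since $T$ is a transition operator it follows that

\[ \mathrm{osc}(v) = \mathrm{osc}(T^mu) \le \mathrm{osc}(u), \]

and from Corollary 3.2 we also find that

\[ \gamma(v) = \gamma(T^mu) \le 3\gamma(u). \]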
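Since $\nu_{x,y}^*$ is the product measure of $\nu_x^*$ and $\nu_y^*$, it is clear from the identities above that

\[ \Bigl| \int_K T^{n-N_1}u(z)\,\nu_x(dz) - \int_K T^{n-N_1}u(z)\,\nu_y(dz) \Bigr| = \Bigl| \int_{K^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr|. \tag{83} \]

From the definitions of the sets $B_1^{(2)}$, $B_2^{(2)}$ and $B_3^{(2)}$ we then find that

\[ \Bigl| \int_{K^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| \le \Bigl| \int_{B_1^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| \]
\[ + \Bigl| \int_{B_2^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| + \Bigl| \int_{B_3^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr|. \tag{84} \]

We have already proved that

\[ \nu_{x,y}^*(B_3^{(2)}) < \kappa \]
and hence

\[ \Bigl| \int_{B_3^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| < \mathrm{osc}(v)\kappa. \tag{85} \]

Next let us write

\[ \theta = \sup\{|v(z) - v(z')| : z \in B,\ z' \in B\}. \tag{86} \]

Then

\[ \Bigl| \int_{B_2^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| \le \theta\,\nu_{x,y}^*(B_2^{(2)}) \le \theta(1 - \nu_{x,y}^*(B_1^{(2)})). \tag{87} \]
But if $0 < \alpha \le \beta \le 1$, $\epsilon > 0$ and $\theta > 0$, then by elementary calculations we find

\[ \beta\min\{\epsilon, \theta\} + (1 - \beta)\theta \le \alpha\epsilon + (1 - \alpha)\theta. \tag{88} \]
Since $\nu_{x,y}^*(B_1^{(2)}) \ge \alpha$, as shown above, it follows from (87) and (88) that

\[ \Bigl| \int_{B_1^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| + \Bigl| \int_{B_2^{(2)}} (v(z) - v(z'))\,\nu_{x,y}^*(dz, dz') \Bigr| \]
\[ \le \min\{\gamma(v)\rho/3, \theta\} \cdot \nu_{x,y}^*(B_1^{(2)}) + \theta(1 - \nu_{x,y}^*(B_1^{(2)})) \le \alpha\gamma(v)\rho/3 + (1 - \alpha)\theta, \]

which, combined with (84), (85) and the bounds $\mathrm{osc}(v) \le \mathrm{osc}(u)$ and $\gamma(v) \le 3\gamma(u)$ obtained above, implies that

\[ \Bigl| \int_K T^{n-N_1}u(z)\,\nu_x(dz) - \int_K T^{n-N_1}u(z)\,\nu_y(dz) \Bigr| \le \mathrm{osc}(u)\kappa + \alpha\gamma(u)\rho + (1 - \alpha)\theta. \tag{89} \]

Hence, from (73), (89) and the definition of $\theta$ (see (86)), it finally
follows that

\[ |T^nu(x) - T^nu(y)| \le \gamma(u)\eta + \mathrm{osc}(u)\kappa + \alpha\gamma(u)\rho + (1 - \alpha)\sup\{|T^{n-N}u(z) - T^{n-N}u(z')| : z \in B,\ z' \in B\}, \]

and since $x$ and $y$ were arbitrarily chosen in $A$, it follows that (72) holds for all
$u \in \mathrm{Lip}[K]$, and hence the tr.pr.f $P_{\mathcal{M}}$ has the shrinking property, which was what
we wanted to prove in order to complete the proof of Theorem 1.1. □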
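
The elementary inequality (88) used in the proof can be seen as follows: with $m = \min\{\epsilon, \theta\} \le \theta$ one has $\beta m + (1 - \beta)\theta = \theta - \beta(\theta - m) \le \theta - \alpha(\theta - m) \le \alpha\epsilon + (1 - \alpha)\theta$ whenever $\alpha \le \beta$. The sketch below is a randomized sanity check of that inequality, not a proof; the function names, the sampling ranges and the tolerance are our own assumptions.

```python
import random

def lhs(beta, eps, theta):
    """Left-hand side of (88)."""
    return beta * min(eps, theta) + (1 - beta) * theta

def rhs(alpha, eps, theta):
    """Right-hand side of (88)."""
    return alpha * eps + (1 - alpha) * theta

def check(trials=10_000, seed=0):
    """Return True if no sampled (alpha, beta, eps, theta) with alpha <= beta violates (88)."""
    rng = random.Random(seed)
    for _ in range(trials):
        alpha = rng.uniform(0.0, 1.0)
        beta = rng.uniform(alpha, 1.0)   # enforce alpha <= beta <= 1
        eps = rng.uniform(0.0, 10.0)
        theta = rng.uniform(0.0, 10.0)
        # small tolerance for floating-point rounding
        if lhs(beta, eps, theta) > rhs(alpha, eps, theta) + 1e-12:
            return False
    return True
```

In the proof the inequality is applied with $\beta = \nu_{x,y}^*(B_1^{(2)})$ and $\epsilon = \gamma(v)\rho/3$.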

8 Exceptional cases. 

We first repeat the example presented in [16]. 

Example 8.1 ([16]) Let $S = \{1, 2, 3, 4\}$ and define $P \in PM(S \times S)$ by

/ * * \ 

0*0* 

* * 
\ ★ ★ / 

where each $*$ denotes the value 1/2. Obviously $P$ is aperiodic and irreducible.
Let $\mathcal{M} = \{M(1), M(2)\}$ be a partition of $P$ such that 1) the first two columns of
$M(1)$ and $P$ are equal and 2) the last two columns of $M(2)$ and $P$ are equal. Let
$S' = \{1, 2\}$. It is not difficult to show that the hypotheses of Theorem 1.2 are
fulfilled with this choice of $S'$, and therefore $P_{\mathcal{M}}$ is not asymptotically stable.

Kesten's counterexample to Blackwell's conjecture is as follows. 

Example 8.2 (Jltf) Let S = {1, 2, ...,8} and define P E PM(S x S) by 

\[
\begin{pmatrix}
* & 0 & 0 & 0 & * & 0 & 0 & 0 \\
0 & * & 0 & 0 & 0 & * & 0 & 0 \\
0 & 0 & 0 & * & 0 & 0 & 0 & * \\
0 & 0 & * & 0 & 0 & 0 & * & 0 \\
* & 0 & 0 & 0 & 0 & 0 & 0 & * \\
0 & * & 0 & 0 & 0 & 0 & * & 0 \\
0 & 0 & 0 & * & * & 0 & 0 & 0 \\
0 & 0 & * & 0 & 0 & * & 0 & 0
\end{pmatrix}
\]

where each $*$ denotes the value 1/2. Obviously $P$ is aperiodic and irreducible.
Let $\mathcal{M} = \{M(1), M(2)\}$ be a partition of $P$ such that 1) the first four columns
of $M(1)$ and $P$ are equal and 2) the last four columns of $M(2)$ and $P$ are equal.

Let $S' = \{1, 2, 3, 4\}$. It is again not difficult to show that the hypotheses
of Theorem 1.2 are fulfilled with this choice of $S'$, and therefore $P_{\mathcal{M}}$ is not
asymptotically stable. It is also not difficult to verify that if, for example, $x \in K_{S'}$
is such that $0 < (x)_1 = (x)_3 < (x)_2 = (x)_4$, then $P_{\mathcal{M}}(x, \cdot)$ is a periodic Markov
chain taking its values in a subset of $K$ consisting of just 8 elements.
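
The last claim of Example 8.2 can be explored by brute force. The sketch below is an illustration under stated assumptions: the positions of the entries 1/2 are read off the matrix above, the starting vector `x0` is a hypothetical choice with $(x)_1 = (x)_3 < (x)_2 = (x)_4$, and the helper names are ours. It enumerates every vector the normalised chain can reach from `x0`, using exact rational arithmetic.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ZERO = Fraction(0)

# Columns (1-based) holding the value 1/2 in rows 1..8 of the matrix in Example 8.2.
STAR_COLUMNS = [(1, 5), (2, 6), (4, 8), (3, 7), (1, 8), (2, 7), (4, 5), (3, 6)]
P = [[HALF if j + 1 in cols else ZERO for j in range(8)] for cols in STAR_COLUMNS]

# M(1) keeps the first four columns of P, M(2) the last four, so M(1) + M(2) = P.
M1 = [[row[j] if j < 4 else ZERO for j in range(8)] for row in P]
M2 = [[row[j] if j >= 4 else ZERO for j in range(8)] for row in P]

def filter_step(x, M):
    """Return xM/||xM|| (l1 norm) as a tuple, or None when xM = 0."""
    y = [sum(x[i] * M[i][j] for i in range(8)) for j in range(8)]
    norm = sum(y)
    return tuple(v / norm for v in y) if norm > 0 else None

def reachable(x):
    """All vectors the chain can visit in one or more steps starting from x."""
    seen, stack = set(), [tuple(x)]
    while stack:
        z = stack.pop()
        for M in (M1, M2):
            w = filter_step(z, M)
            if w is not None and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

# Hypothetical starting point in K_{S'} with (x)_1 = (x)_3 < (x)_2 = (x)_4.
a, b = Fraction(1, 8), Fraction(3, 8)
x0 = (a, b, a, b, ZERO, ZERO, ZERO, ZERO)
orbit = reachable(x0)
```

For this starting point `orbit` contains eight vectors, four supported on the states 1 to 4 and four on the states 5 to 8, in line with the statement of the example.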






We shall now prove Theorem 1.2. We repeat its formulation for convenience,
using the notation $x_n(\mathbf{w}^n) = [xM(\mathbf{w}^n)] = xM(\mathbf{w}^n)/\|xM(\mathbf{w}^n)\|$ introduced in
section 2.

Theorem 1.2 Let $S$ be a denumerable set, let $P \in PM(S \times S)$ and let
$\mathcal{M} = \{M(w) : w \in W\}$ be a partition of $P$.

Suppose that there exists a subset $S' \subset S$ consisting of at least two elements,
such that

1) for every $x \in K_{S'}$ the set $\bigcup_{n=1}^{\infty} K(x, \mathcal{M}^n)$ consists of isolated points,

2) if both $x$ and $y$ are in $K_{S'}$ then $W_{\mathcal{M}^n}(x) = W_{\mathcal{M}^n}(y)$, $n = 1, 2, \dots$,
and

3) if $x$ and $y$ are in $K_{S'}$, then

\[ \|x_n(\mathbf{w}^n) - y_n(\mathbf{w}^n)\| = \|x - y\|, \quad n \ge 1,\ \mathbf{w}^n \in W_{\mathcal{M}^n}(x). \]

If these conditions are fulfilled then $P_{\mathcal{M}}$ is not asymptotically stable.

Proof. Let $S' \subset S$ be as in the hypotheses of the theorem. Let $x \in K_{S'}$ and
set

\[ K'(x) = \bigcup_{n=1}^{\infty} K(x, \mathcal{M}^n). \]

Because of hypothesis 1), the set $K'(x)$ consists of isolated points, and because
of hypothesis 3) it is not difficult to convince oneself that $K'(x)$ must contain
at least two points. Therefore

\[ \epsilon_0 = \inf\{\|z_1 - z_2\| : z_1, z_2 \in K'(x),\ z_1 \ne z_2\} > 0. \]

Since $x \in K_{S'}$ and $S'$ consists of at least two elements, we can find an element
$y \in K_{S'}$ such that $\|x - y\| = \epsilon_0/2$.

Now let $n$ denote an arbitrary positive integer, let

\[ K_1 = \{x_n(\mathbf{w}^n) : \mathbf{w}^n \in W_{\mathcal{M}^n}(x)\} \]

and

\[ K_2 = \{y_n(\mathbf{w}^n) : \mathbf{w}^n \in W_{\mathcal{M}^n}(x)\}. \]

Since the points in $K'(x)$ are isolated points, it is clear that $K_1$ is the support
of $P_{\mathcal{M}}^n(x, \cdot)$, and since $W_{\mathcal{M}^n}(y) = W_{\mathcal{M}^n}(x)$ because of hypothesis 2) and also the
set $\bigcup_{n=1}^{\infty} K(y, \mathcal{M}^n)$ consists of isolated points, it is clear that $K_2$ is the support
of $P_{\mathcal{M}}^n(y, \cdot)$. Since

\[ \inf\{\delta(z, K_1) : z \in K_2\} = \epsilon_0/2 \]
because of hypothesis 3), it therefore follows that the Kantorovich distance

\[ d_K(P_{\mathcal{M}}^n(x, \cdot), P_{\mathcal{M}}^n(y, \cdot)) \ge \epsilon_0/2 > 0, \]

and since $n$ was an arbitrary integer

\[ d_K(P_{\mathcal{M}}^n(x, \cdot), P_{\mathcal{M}}^n(y, \cdot)) \ge \epsilon_0/2 \]

for $n = 1, 2, \dots$, which implies that the tr.pr.f $P_{\mathcal{M}}$ cannot be asymptotically
stable. □

We shall next describe a family of tr.pr.ms for which one, for each matrix 
belonging to the family, can find a partition such that the induced tr.pr.f is not 
asymptotically stable. 
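Let $\mathcal{I}$ denote a denumerable set, let $d \ge 2$ be an integer, set $I_d = \{1, 2, \dots, d\}$,
and define the set $S$ as the Cartesian product of $\mathcal{I}$ and $I_d$, that is

\[ S = \{(i, j) : i \in \mathcal{I},\ j \in I_d\}. \]

Let $A \in PM(\mathcal{I} \times \mathcal{I})$, and let $\mathcal{M} = \{M(w) : w \in W\}$ be a partition of $A$.

Next let $\mathrm{Perm}(d)$ denote the set of $d \times d$ permutation matrices. To each
$w \in W$ and each $(i, k) \in \mathcal{I} \times \mathcal{I}$ we now associate a matrix $Q(i, k, w) \in \mathrm{Perm}(d)$.
We write

\[ \mathcal{Q}_{\mathcal{I},W} = \{Q(i, k, w) : (i, k) \in \mathcal{I} \times \mathcal{I},\ w \in W\}. \]
We define the set

\[ \mathcal{M}' = \{M'(w) : w \in W\} \]

of $S \times S$ matrices by

\[ (M'(w))_{(i,j),(k,m)} = (M(w))_{i,k} \cdot (Q(i, k, w))_{j,m}, \tag{90} \]
and define the $S \times S$ matrix $P$ by

\[ (P)_{(i,j),(k,m)} = \sum_{w \in W} (M'(w))_{(i,j),(k,m)}. \tag{91} \]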
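Proposition 8.1 The matrix $P$ belongs to $PM(S \times S)$.
Proof. Obviously $(P)_{(i,j),(k,m)} \ge 0$. It remains to show that

\[ \sum_{k,m} (P)_{(i,j),(k,m)} = 1, \quad \forall (i, j) \in S. \]

Thus let $(i, j) \in S$ be given. From the definition we know that

\[ \sum_{k,m} (P)_{(i,j),(k,m)} = \sum_{k,m} \sum_{w \in W} (M(w))_{i,k} (Q(i, k, w))_{j,m}, \]

and interchanging the order of summation we obtain

\[ \sum_{k,m} (P)_{(i,j),(k,m)} = \sum_{w \in W} \sum_{k} (M(w))_{i,k} \sum_{m} (Q(i, k, w))_{j,m}. \]

Since $Q(i, k, w)$ is a permutation matrix it follows that

\[ \sum_{m} (Q(i, k, w))_{j,m} = 1. \]

Hence

\[ \sum_{k,m} (P)_{(i,j),(k,m)} = \sum_{w \in W} \sum_{k} (M(w))_{i,k} = \sum_{k} \sum_{w \in W} (M(w))_{i,k} = \sum_{k} (A)_{i,k} = 1. \]
□

We call $P$ the tr.pr.m generated by $A$ and $\mathcal{Q}_{\mathcal{I},W}$.

Corollary 8.1 The set $\mathcal{M}'$ is a partition of $P$.

Proof. Follows from (91). □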

Next suppose that the partition $\mathcal{M} = \{M(w) : w \in W\}$ of $A$ is such that

\[ (M(w))_{i,k} > 0 \Rightarrow (M(w))_{i,k_1} = 0 \ \text{if } k_1 \ne k, \quad \forall M(w) \in \mathcal{M}. \tag{92} \]

Proposition 8.2 Let $\mathcal{I}$ denote a denumerable set, let $A \in PM(\mathcal{I} \times \mathcal{I})$ and let
$\mathcal{M} = \{M(w) : w \in W\}$ be a partition of $A$ that satisfies (92). Let $d \ge 2$, set
$I_d = \{1, 2, \dots, d\}$, and let

\[ \mathcal{Q}_{\mathcal{I},W} = \{Q(i, k, w) : (i, k) \in \mathcal{I} \times \mathcal{I},\ w \in W\}. \]

Let $S = \{(i, j) : i \in \mathcal{I},\ j \in I_d\}$, let $P \in PM(S \times S)$ be the tr.pr.m generated by
$A$ and $\mathcal{Q}_{\mathcal{I},W}$, and let $\mathcal{M}'$ be the partition of $P$ defined by (90).
Then $\mathcal{M}'$ satisfies the hypotheses of Theorem 1.2.

Proof. The proof is based on the following observation. For $i \in \mathcal{I}$, let $S_i = \{(i, j) : j = 1, 2, \dots, d\}$.
Let $x \in K_{S_i}$ and suppose $\|xM'(w)\| > 0$. From (90)
and (92) it follows that $xM'(w)/\|xM'(w)\| \in K_{S_k}$, where $k$ is such that
$(M(w))_{i,k} > 0$. Furthermore, if we let $x'$ denote the $d$-dimensional vector
defined by $(x')_j = (x)_{(i,j)}$, $j = 1, 2, \dots, d$, set $z = xM'(w)/\|xM'(w)\|$ and let $z'$
denote the $d$-dimensional vector defined by $(z')_j = (z)_{(k,j)}$, $j = 1, 2, \dots, d$, then
$z' = x'Q(i, k, w)$. Since $i$ was arbitrary it now easily follows that the hypotheses
of Theorem 1.2 are fulfilled. □

It is easy to show that both Example 8.1 and Example 8.2 can be put into 
the framework of the class just described. We show this for Example 8.2. 

Example 8.3 We need to define a denumerable set $\mathcal{I}$, an integer $d$, a tr.pr.m
$A \in PM(\mathcal{I} \times \mathcal{I})$, a partition $\mathcal{M}$ of $A$ that satisfies (92), and a set of permutation
matrices. We choose $\mathcal{I} = \{1, 2\}$, define the matrix $A$ in the simplest possible
way by

\[ (A)_{i,k} = 1/2, \quad i = 1, 2,\ k = 1, 2, \]

and let the partition $\mathcal{M} = \{M(1), M(2)\}$ of $A$ be defined such that the first
column of $M(1)$ is equal to the first column of $A$ and the second column of
$M(2)$ is equal to the second column of $A$. Clearly $\mathcal{M}$ satisfies (92).

We choose $d = 4$. The state space $S$ is thus $\{1, 2\} \times \{1, 2, 3, 4\}$, which consists
of 8 elements. It remains to determine the set $\mathcal{Q}_{\mathcal{I},W}$ of permutation matrices.
We choose the permutation matrices independent of $w \in W$. Therefore we
denote the permutation matrix $Q(i, k, w)$ by $Q(i, k)$. We thus have to determine
four permutation matrices $Q(i, k)$, $i = 1, 2$, $k = 1, 2$, of format $4 \times 4$. We
define $Q(1, 1)$ by

\[ (Q(1,1))_{j,j} = 1, \quad j = 1, 2, \]

\[ (Q(1,1))_{3,4} = (Q(1,1))_{4,3} = 1, \]

and

\[ (Q(1,1))_{j,m} = 0 \quad \text{otherwise}. \]

We let the matrices $Q(1, 2)$ and $Q(2, 1)$ be defined in the same way as $Q(1, 1)$,
and finally we define the matrix $Q(2, 2)$ by

\[ (Q(2,2))_{1,4} = (Q(2,2))_{2,3} = (Q(2,2))_{3,1} = (Q(2,2))_{4,2} = 1, \]
\[ (Q(2,2))_{j,m} = 0 \quad \text{otherwise}. \]
It is easily checked that the matrix $P$ generated by $A$ and $\mathcal{Q}_{\mathcal{I},W}$ is aperiodic and
irreducible.
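
The construction (90)-(91) with the data of Example 8.3 can be carried out mechanically and compared with the matrix of Example 8.2. The sketch below is an illustration only: the helper names are ours, and we assume (our convention, not stated in the source) that the state $(i, j)$ is identified with the index $4(i - 1) + j$, and that the permutation matrices have their ones at the positions $(1,1), (2,2), (3,4), (4,3)$ for the swap and $(1,4), (2,3), (3,1), (4,2)$ for $Q(2,2)$.

```python
from fractions import Fraction

def perm_matrix(images):
    """d x d permutation matrix with a one at (j, images[j]), 0-based indices."""
    d = len(images)
    return [[1 if images[j] == m else 0 for m in range(d)] for j in range(d)]

swap34 = perm_matrix([0, 1, 3, 2])        # fixes 1 and 2, swaps 3 and 4
Q = {(1, 1): swap34, (1, 2): swap34, (2, 1): swap34,
     (2, 2): perm_matrix([3, 2, 0, 1])}   # ones at (1,4), (2,3), (3,1), (4,2)

half = Fraction(1, 2)

def M(w, i, k):
    """(M(w))_{i,k}: M(w) keeps column w of A, where every entry of A is 1/2."""
    return half if k == w else Fraction(0)

def generated_P(d=4):
    """Matrix P of (91), with the state (i, j) placed at index 4(i - 1) + j."""
    states = [(i, j) for i in (1, 2) for j in range(1, d + 1)]
    return [[sum(M(w, i, k) * Q[(i, k)][j - 1][m - 1] for w in (1, 2))
             for (k, m) in states] for (i, j) in states]

# Columns (1-based) of the entries 1/2 in the rows of the matrix of Example 8.2.
KESTEN_ROWS = [(1, 5), (2, 6), (4, 8), (3, 7), (1, 8), (2, 7), (4, 5), (3, 6)]
kesten = [[half if c + 1 in cols else Fraction(0) for c in range(8)]
          for cols in KESTEN_ROWS]
```

Under these conventions `generated_P()` coincides entry by entry with `kesten`, which is one way to see that Example 8.2 fits into the framework of this section.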

For some (most) initial values $x$, the tr.pr.f $P_{\mathcal{M}'}(x, \cdot)$ gives rise to a
periodic Markov chain; the reason is that all permutation matrices in the
set $\mathcal{Q}_{\mathcal{I},W}$ correspond to odd permutations.

Our last example can be considered as a separate class of examples. 

Example 8.4 Suppose that $S$ is a finite set of size $d \ge 2$, and that $P \in PM(S \times S)$
is doubly stochastic (the transpose of $P$ also belongs to $PM(S \times S)$).
It is well-known that each such matrix $P$ can be written as $\sum_{w=1}^{N} \alpha_w Q_w$,
where $\alpha_w > 0$, $w = 1, 2, \dots, N$, $\sum_{w=1}^{N} \alpha_w = 1$, and each $Q_w$, $w = 1, 2, \dots, N$, is a
permutation matrix. If we now simply define $\mathcal{M} = \{\alpha_w Q_w : w = 1, 2, \dots, N\}$, it
is not difficult to prove that the hypotheses of Theorem 1.2 are fulfilled.

Furthermore, if we define $\mathcal{I} = \{1\}$, let $A$ denote the $1 \times 1$ matrix whose only
element is equal to one, redefine $\mathcal{M}$ by $\mathcal{M} = \{\alpha_w A : w = 1, 2, \dots, N\}$, let $d$
be the same as before, define $\mathcal{Q}_{\mathcal{I},W} = \{Q_w,\ w = 1, 2, \dots, N\}$, let $\mathcal{M}'$ be defined by
(90) with $M(w) = \alpha_w A$, and let $P$ be defined by (91), we see that this example
belongs to the class considered above.
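
The reason why partitions into scaled permutation matrices fit Theorem 1.2 is that a step $x \mapsto x(\alpha_w Q_w)/\|x(\alpha_w Q_w)\|$ merely permutes the coordinates of $x$, so $l_1$ distances between two starting vectors are preserved along any common sequence of indices. The sketch below illustrates this with hypothetical data (the vectors, the weight and the permutation are our choices, as are the helper names).

```python
from fractions import Fraction as F

def apply_perm(x, images):
    """Return xQ, where Q has ones at (j, images[j]): coordinate j moves to images[j]."""
    out = [F(0)] * len(x)
    for j, m in enumerate(images):
        out[m] = x[j]
    return out

def filter_step(x, weight, images):
    """x -> x(aQ)/||x(aQ)|| for the partition element aQ with weight a > 0."""
    y = [weight * v for v in apply_perm(x, images)]
    norm = sum(y)  # l1 norm of a nonnegative vector
    return [v / norm for v in y]

def l1(x, y):
    """l1 distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical probability vectors and partition element (weight 3/10, a 3-cycle).
x = [F(1, 2), F(1, 3), F(1, 6)]
y = [F(1, 6), F(1, 6), F(2, 3)]
x1 = filter_step(x, F(3, 10), [2, 0, 1])
y1 = filter_step(y, F(3, 10), [2, 0, 1])
```

Here `l1(x1, y1)` equals `l1(x, y)`, and the weight $\alpha_w$ cancels in the normalisation, so the same index $w$ is available from every starting vector.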

9 On Condition B. 

In this section we shall first prove that Condition B1 (see subsection 1.5) implies
Condition B. Then we shall make a precise statement of the condition introduced
in [19], which we have called the "rank 1 condition", and we prove that if the set
$S$ is finite then this "rank 1 condition" implies Condition B1.

We also introduce a condition which we call Condition P, which is adapted
to Perron's classical theorem regarding matrices with positive elements (see e.g.
[15], vol. II, Theorem 8.1), and prove that Condition P implies Condition B1.

A condition which seems easy to check in practice, when the space $S$ is finite,
is Condition A, introduced in [16]. In the last part of this section we prove a
result based on a condition which can be regarded as a slight generalization of
Condition A of [16].

Proposition 9.1 Let $S$ be a denumerable set, let $\mathcal{M} = \{M(w) : w \in W\} \in G'(S)$,
let $P \in PM(S \times S)$ be the associated tr.pr.m, and let $\pi$ be the positive
vector in $K$ that satisfies $\pi P = \pi$. Then, if Condition B1 is satisfied, it follows
that Condition B is also satisfied.

Proof. In order to prove Proposition 9.1 we shall first prove the following
lemma in which we formulate some simple inequalities for matrices approaching
a matrix in the set $\mathcal{W}$. (For the definition of the set $\mathcal{W}$ see subsection 1.3.)

Lemma 9.1 Let $u \in U$, $v \in K$, and define $W = u^c v$. Let $i_0$ be such that

\[ (u)_{i_0} > 0. \]

Let $\{W_n,\ n = 1, 2, \dots\}$ be a sequence of matrices of the same format as $W$, and
assume that

1) for $n = 1, 2, \dots$

\[ \|W_n\| = 1, \]

2)

\[ \lim_{n \to \infty} \|e^i(W_n - W)\| = 0, \quad i \in S. \]

Then

1) to every $\epsilon > 0$ and every nonempty compact set $C \subset K$ there exists an integer
$N = N_{\epsilon,C}$ such that if $x \in C$ then for all integers $n \ge N$

\[ \bigl| \|xW_n\| - \|xW\| \bigr| < \epsilon; \]

2) to every $\eta$, $0 < \eta < 1$, there exists an integer $N_\eta$ such that if $x \in K$ is such
that

\[ (x)_{i_0} \ge \eta \]

then for all integers $n \ge N_\eta$

\[ \|xW_n\| \ge (u)_{i_0}\eta/2; \]

3) to every $\eta$, $0 < \eta < 1$, and every nonempty compact set $C \subset K$ and every
$\gamma > 0$, there exists an integer $N = N_{\gamma,C,\eta}$ such that if $x \in C$ and also $(x)_{i_0} \ge \eta$
then $\|xW_n\| > 0$ for all integers $n \ge N$ and furthermore

\[ \bigl\| xW_n/\|xW_n\| - v \bigr\| < \gamma; \]

4) to every $\eta$, $0 < \eta < 1$, and every nonempty compact set $C \subset K$ and every
$\gamma > 0$, there exists an integer $N = N_{\gamma,C,\eta}$ such that if $x \in C$ and $y \in C$ are such
that $(x)_{i_0} \ge \eta$ and $(y)_{i_0} \ge \eta$ then $\|xW_n\| > 0$ and $\|yW_n\| > 0$ for all integers
$n \ge N$, and furthermore

\[ \bigl\| xW_n/\|xW_n\| - yW_n/\|yW_n\| \bigr\| < \gamma. \]

Proof of Lemma 9.1. Let $\epsilon > 0$ and the nonempty compact set $C \subset K$ be
given. Since $C$ is compact and $K$ is a complete, separable, metric space we can
find a set $\mathcal{N}$ of finitely many points $x_1, x_2, ..., x_m$ such that for any $x \in C$ there
exists an element $x_j \in \mathcal{N}$ such that

$$\|x_j - x\| < \epsilon/3. \tag{93}$$

From hypotheses 1) and 2), and the fact that

$$\|x_j\| = 1 < \infty, \quad \forall x_j \in \mathcal{N}$$

it follows that we can find an integer $N = N_{\epsilon,C}$ such that if $n > N$ then

$$\max\{\|x_j(W_n - W)\| : x_j \in \mathcal{N}\} < \epsilon/3. \tag{94}$$

Since $\|W_n\| = 1$ for $n = 1, 2, ...$, we conclude from (93) and (94) that if $x \in C$
and $x_j \in \mathcal{N}$ satisfies $\|x_j - x\| < \epsilon/3$ then

$$|\,\|xW_n\| - \|xW\|\,| \le \|x(W_n - W)\| \le \|(x - x_j)(W_n - W)\| + \|x_j(W_n - W)\| \le 2\|x - x_j\| + \epsilon/3 < \epsilon,$$

if $n > N$, which proves part 1) of the lemma.

Next consider $\|xW_n\|$ when $x \in K$ is such that $(x)_{i_0} > \eta$. We have

$$\|xW_n\| \ge (x)_{i_0}\|e^{i_0}W_n\| = (x)_{i_0}\|e^{i_0}W_n - e^{i_0}W + e^{i_0}W\| \ge (x)_{i_0}\|e^{i_0}W\| - (x)_{i_0}\|e^{i_0}W_n - e^{i_0}W\|.$$

But

$$(x)_{i_0}\|e^{i_0}W\| > \eta (u)_{i_0}\Big(\sum_{k \in S}(v)_k\Big) = \eta (u)_{i_0}$$

and

$$(x)_{i_0}\|e^{i_0}W_n - e^{i_0}W\| < (u)_{i_0}\eta/2$$

if $n$ is sufficiently large because of hypothesis 2) of the lemma. Hence

$$\|xW_n\| \ge (x)_{i_0}\|e^{i_0}W\| - (x)_{i_0}\|e^{i_0}W_n - e^{i_0}W\| > \eta(u)_{i_0} - (u)_{i_0}\eta/2 = (u)_{i_0}\eta/2$$

if $n$ is sufficiently large and thereby we have proved part 2) of the lemma.

To prove part 3) of the lemma, let $\gamma > 0$ be given, let $\eta$, $0 < \eta < 1$, be given,
and let the nonempty compact set $C$ be given. Let $x \in C$ be such that

$$(x)_{i_0} > \eta.$$

Now for any $y \in K$ for which $\|yW\| > 0$ we have

$$yW/\|yW\| = y u^c v/\|y u^c v\| = (y, u^c)v/\|(y, u^c)v\| = v/\|v\| = v.$$

Since $(x)_{i_0} > \eta > 0$ and $(u)_{i_0} > 0$, we clearly have $\|xW\| > \eta (u)_{i_0} > 0$ and hence

$$xW/\|xW\| = v.$$

Next let $N_0$ be so large that $\|xW_n\| > (u)_{i_0}\eta/2$ for all $n > N_0$ and all $x \in K$ for
which $(x)_{i_0} > \eta$. That we can find such a number follows from part 2) of this
lemma, which we just proved. Now by using inequality (38) of Section 3 we can
conclude that

$$\|(xW_n/\|xW_n\|) - v\| = \|xW_n/\|xW_n\| - xW/\|xW\|\| \le 2\|xW_n - xW\|/\max\{\|xW_n\|, \|xW\|\} < 2\|xW_n - xW\|\,(2/((u)_{i_0}\eta)) \tag{95}$$

if $n > N_0$, if $x \in C$ and if $(x)_{i_0} > \eta$. From part 1) of this lemma then follows
that we can choose an integer $N_1 > N_0$ so large that if $n > N_1$ then

$$\|xW_n - xW\| < \gamma \cdot ((u)_{i_0}\eta/4). \tag{96}$$

Hence, by combining (95) and (96) we can conclude that

$$\|xW_n/\|xW_n\| - xW/\|xW\|\| < \gamma$$

if $x \in C$, $(x)_{i_0} > \eta$, and $n > N_1$ and thereby we have proved part 3) of the
lemma.

Part 4) finally follows from part 3) of the lemma, applied with $\gamma/2$ in place of
$\gamma$, and the triangle inequality. □

We now continue the proof of Proposition 9.1. Our aim is to verify Condition
B. (See Definition 7.1.) Thus let $\rho > 0$ be given. What we shall prove is that
we can find an element $i_0 \in S$, such that if $C$ is a compact set such that

$$\mu(C \cap E_{i_0}((\pi)_{i_0}/2)) > (\pi)_{i_0}/3 \tag{97}$$

for all $\mu \in \mathcal{P}(K|\pi)$, then we can find an integer $N$ and an element $w^N \in \mathcal{W}^N$
such that

$$\|e^{i_0}M(w^N)\| > 0 \tag{98}$$

and

$$\|[xM(w^N)] - [e^{i_0}M(w^N)]\| < \rho, \quad \forall x \in E_{i_0}((\pi)_{i_0}/2) \cap C. \tag{99}$$

Since Condition B1 is satisfied there exist a vector $u \in \mathcal{U}$, a vector $v \in K$, a
sequence of integers $\{n_1, n_2, ...\}$, and a sequence $\{w_j^{n_j}, j = 1, 2, ...\}$ of elements
in $\mathcal{W}^{n_j}$ respectively, such that $\|M(w_j^{n_j})\| > 0$, $j = 1, 2, ...$ and such that if we
define $W = u^c v$ then for all $i \in S$

$$\lim_{j \to \infty} \|(e^i M(w_j^{n_j})/\|M(w_j^{n_j})\|) - e^i W\| = 0. \tag{100}$$

Let us choose $i_0 \in S$ such that $(u)_{i_0} > 0$ and let $C$ be a compact set such
that (97) holds for all $\mu \in \mathcal{P}(K|\pi)$. Since $(u)_{i_0} > 0$ it follows that $\|e^{i_0}W\| =
(u)_{i_0} > 0$. By (100) then follows that $\|e^{i_0}M(w_j^{n_j})\| > 0$, $j = 1, 2, ...$ if we let the
enumeration $n_1, n_2, ...$ start with a sufficiently large $n_j$. Since obviously

$$\|M(w_j^{n_j})/\|M(w_j^{n_j})\|\| = 1$$

and $C \cup \{e^{i_0}\}$ is a compact set, it follows from assertion 4) of Lemma 9.1 that
if $j$ is sufficiently large then

$$\|[xM(w_j^{n_j})] - [e^{i_0}M(w_j^{n_j})]\| < \rho, \quad \forall x \in E_{i_0}((\pi)_{i_0}/2) \cap C$$

and hence (99) holds which was what we wanted to prove. □

The following condition was introduced in [19] by Kochman and Reeds for
the case when $S$ is finite. We call it Condition KR.

Condition KR. Let $S$ be a finite set, let $P \in PM(S \times S)$, let $A$ be another
finite set and let $R$ be a transition probability matrix from $S$ to $A$. Let
$\mathcal{M} = \{M(a), a \in A\}$ be a partition of $P$ determined by

$$(M(a))_{i,j} = (P)_{i,j}(R)_{j,a}.$$

(As we pointed out in subsection 1.2 it is easily checked that $\mathcal{M} = \{M(a), a \in A\}$
is a partition of $P$.)
Define

$$\mathcal{M}^* = \cup_{n=1}^{\infty} \mathcal{M}^n,$$

$$\mathcal{C} = \{C = aM : a > 0, M \in \mathcal{M}^*\}$$

and let the set $\overline{\mathcal{C}}$ be defined as the closure of $\mathcal{C}$ under the usual topology in
$R^{S \times S}$.

If $\overline{\mathcal{C}}$ contains a matrix of rank 1 then we say that Condition KR is satisfied.

Corollary 9.1 Let $S$ be a finite set, let $P$ be a tr.pr.m in $PM(S \times S)$ which
is irreducible and aperiodic and let $\mathcal{M}$ be a partition of $P$. Then Condition KR
implies Condition B.

Proof. Because of Proposition 9.1 it suffices to prove that Condition KR implies
Condition B1. In order to do this first note that if $W$ is a finite-dimensional,
square, nonnegative $S \times S$ matrix of rank 1 with $\|W\| = 1$, then $W$ can be written
$W = u^c v$ where $u$ is a nonnegative $S$-dimensional vector satisfying $\max_i (u)_i = 1$,
and where $v \in K$.

Suppose now that Condition KR is satisfied. Then we can find a nonnegative
matrix $W$ of rank 1, a sequence of integers $n_1, n_2, ...$, a sequence $w_j^{n_j}$, $j =
1, 2, ...$ of elements in $\mathcal{W}^{n_j}$, $j = 1, 2, ...$ and a sequence $\lambda_j$, $j = 1, 2, ...$ of real
numbers, such that

$$\lim_{j \to \infty} \|M(w_j^{n_j})/\lambda_j - W\| = 0. \tag{101}$$

By using the triangle inequality, the norm inequality $|\,\|A\| - \|B\|\,| \le \|A - B\|$,
and the fact that $\|W\| = 1$, it is easily proved that we then also have

$$\lim_{j \to \infty} \|M(w_j^{n_j})/\|M(w_j^{n_j})\| - W\| = 0 \tag{102}$$

from which Condition B1 follows, since for any two matrices of the same format
$\|xA - xB\| \le \|x\|\,\|A - B\|$. □
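As a small illustration of the partition used in Condition KR, the following Python sketch builds the matrices $M(a)$ from a transition matrix and an observation matrix and checks that they sum to $P$. The two-state chain `P`, the matrix `R` and the helper names `kr_partition` and `matrix_sum` are hypothetical choices made for this example only; they do not come from the paper.

```python
# Hypothetical 2-state chain P and observation matrix R (illustrative values).
P = [[0.5, 0.5],
     [0.25, 0.75]]
R = [[1.0, 0.0],   # state 0 always emits symbol 0
     [0.5, 0.5]]   # state 1 emits symbol 0 or 1 with equal probability


def kr_partition(P, R):
    """Return the list of matrices M(a) with (M(a))_{i,j} = (P)_{i,j} (R)_{j,a}."""
    n_states, n_symbols = len(P), len(R[0])
    return [[[P[i][j] * R[j][a] for j in range(n_states)]
             for i in range(n_states)]
            for a in range(n_symbols)]


def matrix_sum(mats):
    """Entrywise sum of a list of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) for j in range(cols)]
            for i in range(rows)]


partition = kr_partition(P, R)
total = matrix_sum(partition)   # should reproduce P, since each row of R sums to 1
```

The check works because each row of `R` sums to one, so summing $(P)_{i,j}(R)_{j,a}$ over $a$ gives back $(P)_{i,j}$.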

Next two more notations. Let $S$ be a denumerable set. For an arbitrary
matrix $M = \{(M)_{i,j}, i \in S, j \in S\}$ we define

$$S_1(M) = \{i \in S : (M)_{i,j} > 0, \text{ some } j \in S\} \tag{103}$$

$$S_2(M) = \{j \in S : (M)_{i,j} > 0, \text{ some } i \in S\}. \tag{104}$$

What makes the case when $S$ is finite somewhat easier to handle is that one
can use Perron's theorem for finite dimensional matrices with positive elements
(see e.g. [13], vol II, Theorem 8.1) in order to verify Condition KR. The following
condition is adapted to Perron's theorem when $S$ is denumerable.

Condition P. Let $S$ be a denumerable set, let $P \in PM(S \times S)$ and let
$\mathcal{M} = \{M(w) : w \in \mathcal{W}\}$ be a partition of $P$. If there exist an integer $N$ and an
element $w^N \in \mathcal{W}^N$, such that the matrix $M(w^N)$ is such that

1) the set $S_2(M(w^N))$ is finite,
2) the set $S_0$ defined by

$$S_0 = \{i \in S : (M(w^N))_{i,i} > 0\} \tag{105}$$

is nonempty,
3)

$$(M(w^N))_{i,j} > 0 \ \text{if } i \in S_0, \text{ and } j \in S_0,$$

4)

$$(M(w^N))_{i,j} = 0 \ \text{if } i \in S_2(M(w^N)) \setminus S_0,$$

then we say that Condition P is satisfied.
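For a finite matrix the hypotheses of Condition P can be checked mechanically. The sketch below is only an illustration under that finiteness assumption (so hypothesis 1) is automatic); the function name `check_condition_p` and the two test matrices are hypothetical.

```python
def check_condition_p(G):
    """Check hypotheses 2)-4) of Condition P for a finite square matrix G.

    Hypothesis 1) (finiteness of S_2(G)) holds automatically for finite G.
    """
    n = len(G)
    # S_2(G): columns that contain at least one positive entry, see (104).
    s2 = {j for j in range(n) if any(G[i][j] > 0 for i in range(n))}
    # S_0: indices with a positive diagonal entry, see (105).
    s0 = {i for i in range(n) if G[i][i] > 0}
    nonempty = len(s0) > 0                                      # hypothesis 2)
    positive_block = all(G[i][j] > 0 for i in s0 for j in s0)   # hypothesis 3)
    zero_rows = all(G[i][j] == 0                                # hypothesis 4)
                    for i in s2 - s0 for j in range(n))
    return nonempty and positive_block and zero_rows


# Rows indexed by S_0 = {0}, S_2 \ S_0 = {1} and S \ S_2 = {2}.
G_ok = [[0.2, 0.3, 0.0],
        [0.0, 0.0, 0.0],
        [0.1, 0.4, 0.0]]
# Here row 1 belongs to S_2 \ S_0 but is not a zero row.
G_bad = [[0.2, 0.3],
         [0.5, 0.0]]
```

With these inputs `check_condition_p(G_ok)` holds while `check_condition_p(G_bad)` fails on hypothesis 4).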

Proposition 9.2 If Condition P is satisfied then Condition B1 is also satisfied.

Proof. Our arguments will be similar to those given by Kochman and Reeds
in section 5, in their proof of Theorem 2 in [19].

It is not difficult to convince oneself that it is no loss of generality to assume
that $S$ is a finite or infinite set consisting of consecutive positive integers starting
with the number 1. It is also clear that we can assume that the set $S_0$ defined
by (105) is such that

$$S_0 = \{1, 2, ..., d\}$$

where $d$ is a positive integer, and that the set $S_2(M(w^N))$ is such that

$$S_2(M(w^N)) = \{1, 2, ..., L\}$$

where $L$ is a positive integer $\ge d$.
Next let us write

$$G = M(w^N),$$
$$A = \{(M(w^N))_{i,j} : i \in S_0, \ j \in S_0\},$$
$$B = \{(M(w^N))_{i,j} : i \in S_0, \ j \in S_2(M(w^N)) \setminus S_0\},$$
$$C = \{(M(w^N))_{i,j} : i \in S \setminus S_2(M(w^N)), \ j \in S_0\}$$

and

$$D = \{(M(w^N))_{i,j} : i \in S \setminus S_2(M(w^N)), \ j \in S_2(M(w^N)) \setminus S_0\}.$$

We can then write

$$G = \begin{pmatrix} A & B & 0 \\ 0 & 0 & 0 \\ C & D & 0 \end{pmatrix}$$

where each $0$ denotes a zero-matrix of appropriate format, and where the $0$-matrix
in the first column has the same number of rows as the number of
columns in $B$. (In case $S_2(M(w^N)) = S$ the third row and the third column
are omitted.)

By induction it is straightforward to prove that

$$G^n = \begin{pmatrix} A \\ 0 \\ C \end{pmatrix} A^{n-2} \begin{pmatrix} A & B & 0 \end{pmatrix}$$

if $n \ge 2$.

Since Condition P is satisfied, it follows that $A$ is a finite-dimensional square
matrix with strictly positive elements. Therefore, by Perron's theorem (see e.g.
[13], vol II, Theorem 8.1), it follows that there exist a number $\lambda > 0$ and a
rank-1 matrix $\bar{A}$ with strictly positive elements, such that

$$\lim_{n \to \infty} \|A^n/\lambda^n - \bar{A}\| = 0, \tag{106}$$

where the norm $\|\cdot\|$ is for example defined by (7).
Next let us define

$$W_0 = (1/\lambda^2) \begin{pmatrix} A \\ 0 \\ C \end{pmatrix} \bar{A} \begin{pmatrix} A & B & 0 \end{pmatrix}.$$

Since the elements of the matrices $C$ and $B$ are uniformly bounded and $S_2(W_0)$
is a finite set, $\|W_0\|$ exists and clearly $\|W_0\| > 0$. By using (106) it is elementary
to prove that for all $i \in S$

$$\lim_{n \to \infty} \|e^i G^n/\lambda^n - e^i W_0\| = 0$$

and also that

$$\lim_{n \to \infty} |\,\|G^n/\lambda^n\| - \|W_0\|\,| = 0, \tag{107}$$

and therefore by defining

$$W = W_0/\|W_0\|$$

it follows that

$$\lim_{n \to \infty} \|e^i G^n/\|G^n\| - e^i W\| = 0 \tag{108}$$

for all $i \in S$ and thereby we have verified Condition B1. □
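The limit (108) can be observed numerically on a small matrix. The sketch below is an illustration only: the $3 \times 3$ matrix `G` is a hypothetical example with $S_0 = S_2(G) = \{0, 1\}$, and the maximum absolute row sum is assumed to play the role of the norm (7). It normalises $G^n$ and measures how far the result is from having rank 1 by its largest $2 \times 2$ minor.

```python
def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def norm(X):
    """Maximum absolute row sum (assumed stand-in for the norm (7))."""
    return max(sum(abs(v) for v in row) for row in X)


def normalized_power(G, n):
    """Return G^n / ||G^n||, rescaling at every step to avoid underflow."""
    s = norm(G)
    Gn = [[v / s for v in row] for row in G]
    for _ in range(n - 1):
        Gn = matmul(Gn, G)
        s = norm(Gn)
        Gn = [[v / s for v in row] for row in Gn]
    return Gn


def max_minor(X):
    """Largest absolute 2x2 minor; it vanishes exactly for rank-1 matrices."""
    n, m = len(X), len(X[0])
    return max(abs(X[i][j] * X[k][l] - X[i][l] * X[k][j])
               for i in range(n) for k in range(n)
               for j in range(m) for l in range(m))


# Block form (A 0; C 0) with a strictly positive 2x2 block A.
G = [[0.2, 0.1, 0.0],
     [0.3, 0.4, 0.0],
     [0.5, 0.5, 0.0]]
W = normalized_power(G, 40)
```

For this `G` the block $A$ has Perron eigenvalue $0.5$ and second eigenvalue $0.1$, so the rows of `W` are already numerically proportional to $(1, 1, 0)$ after 40 steps.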

Next let us introduce the terminology subrectangular matrix, a notion introduced
in [16] in a more special setting.

Definition 9.1 Let $M = \{(M)_{i,j} : i \in S, j \in S\}$ be a matrix such that if

$$(M)_{i_1,j_1} \ne 0 \ \text{and also} \ (M)_{i_2,j_2} \ne 0$$

then also

$$(M)_{i_1,j_2} \ne 0 \ \text{and} \ (M)_{i_2,j_1} \ne 0,$$

then we call $M$ a subrectangular matrix.
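Definition 9.1 says that the nonzero entries of $M$ fill out a full "rectangle" of rows and columns. For a finite matrix this can be tested directly, as in the following sketch; the function name `is_subrectangular` and the example matrices are hypothetical.

```python
def is_subrectangular(M):
    """True if the nonzero entries of M are exactly rows x cols (Definition 9.1)."""
    rows = {i for i, row in enumerate(M) if any(v != 0 for v in row)}
    cols = {j for row in M for j, v in enumerate(row) if v != 0}
    # Every combination of a nonzero row and a nonzero column must be nonzero.
    return all(M[i][j] != 0 for i in rows for j in cols)


def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


rect = [[1, 2, 0],
        [0, 0, 0],
        [3, 4, 0]]
identity = [[1, 0],
            [0, 1]]
other = [[1, 0, 0],
         [0, 0, 1],
         [1, 1, 1]]
product_matrix = matmul(rect, other)   # [[1, 0, 2], [0, 0, 0], [3, 0, 4]]
```

The identity matrix fails the test, while the product of the subrectangular matrix `rect` with the nonnegative matrix `other` is again subrectangular, in line with the product property used in the proof of Proposition 9.3.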

Our next result is a generalization of Theorem A of [16] from the case when
$S$ is finite to the case when $S$ is denumerable. Recall that $\mathcal{G}'(S)$ is the set
of partitions for which each associated tr.pr.m $P$ is irreducible, aperiodic and
positively recurrent.

Proposition 9.3 Let $S$ be a denumerable set and let $\mathcal{M} = \{M(w) : w \in \mathcal{W}\} \in
\mathcal{G}'(S)$ and suppose also that

1) there exist an integer $N_1$ and an element $a^{N_1}$ in $\mathcal{W}^{N_1}$, such that $M(a^{N_1})$ is
a non-zero subrectangular matrix,

2) there exist an integer $N_2$ and an element $b^{N_2}$ in $\mathcal{W}^{N_2}$, such that $S_2(M(b^{N_2}))$
is a finite set.

It then follows that $\mathbf{P}_{\mathcal{M}}$ is asymptotically stable.

Proof. From Proposition 9.2, Proposition 9.1 and Theorem 1.1 it follows that
it suffices to prove that Condition P is satisfied.

In order to do this we have to find an integer $N$ and an element $w^N =
(w_1, w_2, ..., w_N) \in \mathcal{W}^N$ such that the matrix $M(w^N)$ satisfies the hypotheses
of Condition P.

First let $j_0 \in S_2(M(b_1)M(b_2) \cdots M(b_{N_2}))$ and $m \in S_1(M(b^{N_2}))$ be such that
$(M(b^{N_2}))_{m,j_0} > 0$.

Next let $k \in S_2(M(a^{N_1}))$. Since $P$ is irreducible there exist an integer $N_3$
and an element $c^{N_3} \in \mathcal{W}^{N_3}$ such that $(M(c^{N_3}))_{k,m} > 0$.

Next let $i \in S_1(M(a^{N_1}))$. Since $M(a^{N_1})$ is subrectangular, $(M(a^{N_1}))_{i,k} > 0$.

Again using the fact that $P$ is irreducible it follows that we can find an
integer $N_4$ and an element $d^{N_4} \in \mathcal{W}^{N_4}$ such that $(M(d^{N_4}))_{j_0,i} > 0$.

Now set $N = N_1 + N_2 + N_3 + N_4$, define the element $w^N \in \mathcal{W}^N$ by

$$w^N = (d^{N_4}, a^{N_1}, c^{N_3}, b^{N_2})$$

and set

$$G = M(w^N) = M(d^{N_4})M(a^{N_1})M(c^{N_3})M(b^{N_2}).$$

That $S_2(G)$ is a finite set follows from the fact that $S_2(M(b^{N_2}))$ is a finite
set, and hence hypothesis 1) of Condition P is fulfilled. By the construction
of $G$ it is clear that $(G)_{j_0,j_0} > 0$ since $(M(d^{N_4}))_{j_0,i} > 0$, $(M(a^{N_1}))_{i,k} > 0$,
$(M(c^{N_3}))_{k,m} > 0$ and $(M(b^{N_2}))_{m,j_0} > 0$, and hence hypothesis 2) is fulfilled.

Furthermore if we have three nonnegative square matrices $A_1$, $A_2$ and $A_3$ of
the same format such that

$$A_1 = A_2 A_3$$

and either $A_2$ or $A_3$ - or both - are subrectangular, it is easily proved that $A_1$
is also subrectangular. Therefore $G$ is a subrectangular matrix since $M(a^{N_1})$
is subrectangular. Defining $S_0 = \{i : (G)_{i,i} > 0\}$ and using the fact that $G$ is
subrectangular it follows that

$$(G)_{i,j} > 0 \ \text{if } i \in S_0 \text{ and } j \in S_0$$

and hence hypothesis 3) is fulfilled.

It remains to show that $(G)_{i,j} = 0$ if $i \in S_2(G) \setminus S_0$, and $j \in S_2(G)$. Let us
assume that $(G)_{i,j} > 0$ for $i \in S_2(G) \setminus S_0$ and $j \in S_2(G)$. Since $i \in S_2(G) \setminus S_0$
there exists an element $i'$ such that $(G)_{i',i} > 0$. Since $(G)_{j_0,j_0} > 0$, $(G)_{i,j} > 0$,
$(G)_{i',i} > 0$ and $G$ is a subrectangular matrix, it follows that both $(G)_{i,j_0} > 0$
and $(G)_{j_0,i} > 0$. Again using the fact that the matrix $G$ is subrectangular it
follows that $(G)_{i,i} > 0$. This however contradicts the fact that $i \notin S_0$. Hence
hypothesis 4) of Condition P is fulfilled and hence Condition P is satisfied. □

10 A random walk example. 

One drawback with Condition P introduced in the previous section is hypothesis
1), which requires that the set $S_2(M(w^N))$ is a finite set.

In this section we shall look at an example in which we verify that Condition
B holds although the set $S_2(M(w^N))$ is infinite for every positive integer $N$ and
every $w^N \in \mathcal{W}^N$.

Let $S = \{..., -2, -1, 0, 1, 2, ...\}$, let $S_{odd} = \{i \in S, i \text{ odd}\}$ and $S_{even} = \{i \in
S, i \text{ even}\}$. For $i \in S$, let $a_i$, $b_i$ and $c_i$ be positive numbers satisfying

$$a_i + b_i + c_i = 1.$$

Let $P \in PM(S \times S)$ be defined such that

$$(P)_{i,i} = b_i, \quad \forall i \in S,$$

$$(P)_{i,i+1} = (P)_{-i,-(i+1)} = c_i, \quad \forall i \ge 0$$

and

$$(P)_{i,i-1} = (P)_{-i,-(i-1)} = a_i, \quad \forall i \ge 1.$$

Let $\mathcal{M} = \{M(1), M(2)\}$ be a partition of $P$ such that if $i$ is odd then the
$i$:th column of $P$ is equal to the $i$:th column of $M(1)$ and if $i$ is even then the
$i$:th column of $P$ is equal to the $i$:th column of $M(2)$.

Theorem 10.1 Let the tr.pr.m $P$ and the partition $\mathcal{M}$ be defined as above, and
suppose also that

$$\sum_{n=1}^{\infty} \prod_{i=1}^{n} c_{i-1}/a_i < \infty. \tag{109}$$

A) If $b_i = b_0$, $\forall i \in S$, then $\mathbf{P}_{\mathcal{M}}$ is asymptotically stable.

B) If there exists $i_0 \in S$ such that $b_{i_0} > \sup\{b_i : i \in S, i \ne i_0\}$ then $\mathbf{P}_{\mathcal{M}}$ is
asymptotically stable.
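To make the construction concrete, the following Python sketch builds a finite truncation of the random walk and its two-matrix partition, and evaluates partial sums of the series in (109). It rests on several assumptions that are not in the paper: the walk is cut off at the states $-n, ..., n$ with the outward step at the boundary folded into the diagonal, the parameters are the constants $a_i = 0.4$, $b_i = 0.4$, $c_i = 0.2$ for $i \ge 1$ with $c_0 = (1 - b_0)/2$, and all function names are hypothetical.

```python
def build_walk(n, a=0.4, b=0.4, c=0.2):
    """Truncated walk on {-n,...,n}; index k stands for state k - n.

    The boundary rule (outward mass added to the diagonal) is an assumption
    made only to keep the truncated matrix stochastic.
    """
    size = 2 * n + 1
    P = [[0.0] * size for _ in range(size)]
    for state in range(-n, n + 1):
        k = state + n
        P[k][k] = b
        if state == 0:
            P[k][k + 1] = P[k][k - 1] = (1 - b) / 2
            continue
        sign = 1 if state > 0 else -1
        P[k][k - sign] = a            # step towards the origin
        if abs(state) < n:
            P[k][k + sign] = c        # step away from the origin
        else:
            P[k][k] += c              # boundary: stay put instead
    return P


def split_by_column_parity(P, n):
    """M(1) keeps the odd-state columns of P, M(2) the even-state columns."""
    size = len(P)
    M1 = [[P[i][j] if (j - n) % 2 != 0 else 0.0 for j in range(size)]
          for i in range(size)]
    M2 = [[P[i][j] if (j - n) % 2 == 0 else 0.0 for j in range(size)]
          for i in range(size)]
    return M1, M2


def series_109(terms, a=0.4, b=0.4, c=0.2):
    """Partial sum of sum_n prod_{i=1}^n c_{i-1}/a_i with c_0 = (1 - b)/2."""
    c0 = (1 - b) / 2
    total, prod = 0.0, 1.0
    for m in range(1, terms + 1):
        prod *= (c0 if m == 1 else c) / a
        total += prod
    return total


n = 4
P = build_walk(n)
M1, M2 = split_by_column_parity(P, n)
partial = series_109(60)   # geometric series with limit 1.5 for these constants
```

For these constants the terms of (109) are $0.75 \cdot 0.5^{n-1}$, so the series converges and hypothesis (109) holds.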

Proof. From the definition of $P$ it is clear that $P$ is aperiodic and irreducible.
That $P$ is positively recurrent follows from (109) (see e.g. [23], Problem 18,
chapter 2). We let $\pi$ denote the unique probability vector such that $\pi P = \pi$.

We first assume that $b_i = b_0$, $\forall i \in S$, and shall prove the assertion of the
theorem by verifying that Condition B holds.

Set $1 - b_0 = a$. Clearly $a_i + c_i = a$, and $b_i = 1 - a$, $\forall i \in S$. In this case
the set $K$ of probability vectors on $S$ is defined by

$$K = \{x = ((x)_i, -\infty < i < \infty) : (x)_i \ge 0, \ \sum_{i=-\infty}^{\infty} (x)_i = 1\}.$$

To each $x \in K$ we associate two other vectors which we denote by $x^1$ and
$x^2$, and which we define by

$$(x^1)_i = (x)_i, \ i \in S_{odd}, \quad (x^1)_i = 0, \ i \in S_{even},$$

$$(x^2)_i = (x)_i, \ i \in S_{even}, \quad (x^2)_i = 0, \ i \in S_{odd}. \tag{110}$$

Clearly

$$x = x^1 + x^2.$$

It is easily seen that if $i \in S_{odd}$ then $\|e^i M(1)\| = b = 1 - a$, $\|e^i M(2)\| = a$
and that if $i \in S_{even}$ then $\|e^i M(1)\| = a$, $\|e^i M(2)\| = b = 1 - a$ from which
follows that

$$\|xM(1)\| = \|x^1\|(1 - a) + \|x^2\|a, \quad x \in K \tag{111}$$

and

$$\|xM(2)\| = \|x^1\|a + \|x^2\|(1 - a), \quad x \in K. \tag{112}$$

Next, let $M = M(1)M(2)$. Let us first note that if $x \in K$ is such that
$(x)_i = 0$, $i \in S_{odd}$, which thus implies that $x = x^2$, then from (111) and (112) it
follows that

$$\|xM(1)M(2)\| = \|xM(1)\| \, \|[xM(1)]M(2)\| = a \cdot a = a^2 \tag{113}$$

since $S_2(M(1)) = S_{odd}$.

We shall next compute $(M)_{i,j}$ when $i \in S_{even}$. For $i \in S_{even}$ and $i \ge 2$ we
find that

$$(M)_{i,i} = (M)_{-i,-i} = (M(1))_{i,i+1}(M(2))_{i+1,i} + (M(1))_{i,i-1}(M(2))_{i-1,i} = c_i \cdot a_{i+1} + a_i \cdot c_{i-1}.$$

Furthermore, we find that

$$(M)_{i,i+2} = (M)_{-i,-i-2} = (M(1))_{i,i+1}(M(2))_{i+1,i+2} = c_i \cdot c_{i+1}$$

and that

$$(M)_{i,i-2} = (M)_{-i,-i+2} = (M(1))_{i,i-1}(M(2))_{i-1,i-2} = a_i \cdot a_{i-1}.$$

For $i \in S_{even}$, $i \ne 0$ and $k \ne i$, $k \ne i + 2$, $k \ne i - 2$ it is clear that we have
$(M)_{i,k} = 0$. For $i = 0$ we find

$$(M)_{0,0} = (a/2)a_1 + (a/2)a_1 = a a_1; \quad (M)_{0,2} = (M)_{0,-2} = (a/2)c_1;$$

and

$$(M)_{0,j} = 0 \ \text{otherwise}.$$

Since

$$\sum_{k \in S_{even}} (M)_{i,k} = \|e^i M\| = a^2, \quad \forall i \in S_{even}$$

because of (113), we can conclude that the matrix $(1/a^2)M$ can be considered
as a tr.pr.m on $S_{even}$.

We now define the matrices $A = \{(A)_{i,j}, i \in S, j \in S\}$ and $B = \{(B)_{i,j}, i \in
S, j \in S\}$ by

$$(A)_{i,j} = (M)_{i,j}, \quad i \in S_{even}, \ j \in S,$$
$$(A)_{i,j} = 0, \quad \text{otherwise},$$

and

$$(B)_{i,j} = (M)_{i,j} - (A)_{i,j}, \quad i \in S, \ j \in S.$$

Evidently $M = A + B$. Since $S_2(A) \subset S_{even}$, $S_1(B) \subset S_{odd}$ and $S_2(B) \subset S_{even}$
it is evident that

$$(A + B) \cdot (A + B) = A^2 + BA = MA$$

and more generally that for $n = 2, 3, ...$

$$M^n = (A + B)^n = A^n + BA^{n-1} = MA^{n-1}. \tag{114}$$
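The identity (114) depends only on the support structure of $A$ and $B$, so it can be checked on a small truncated walk. The sketch below is an illustration under assumptions not in the paper: the $5 \times 5$ matrix `P` is a hypothetical truncation to the states $-2, ..., 2$, with index $k$ standing for state $k - 2$.

```python
def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Hypothetical truncated walk on the states -2,...,2 (index k = state + 2).
P = [[0.6, 0.4, 0.0, 0.0, 0.0],
     [0.2, 0.4, 0.4, 0.0, 0.0],
     [0.0, 0.3, 0.4, 0.3, 0.0],
     [0.0, 0.0, 0.4, 0.4, 0.2],
     [0.0, 0.0, 0.0, 0.4, 0.6]]
size = len(P)
even = [(k - 2) % 2 == 0 for k in range(size)]   # parity of the state, not the index

# M(1) keeps the odd-state columns of P, M(2) the even-state columns.
M1 = [[P[i][j] if not even[j] else 0.0 for j in range(size)] for i in range(size)]
M2 = [[P[i][j] if even[j] else 0.0 for j in range(size)] for i in range(size)]
M = matmul(M1, M2)

# A carries the even-state rows of M, B the remaining (odd-state) rows.
A = [[M[i][j] if even[i] else 0.0 for j in range(size)] for i in range(size)]
B = [[M[i][j] - A[i][j] for j in range(size)] for i in range(size)]

AB = matmul(A, B)                       # vanishes: columns of A even, rows of B odd
BB = matmul(B, B)                       # vanishes for the same reason
lhs = matmul(matmul(M, M), M)           # M^3
rhs = matmul(M, matmul(A, A))           # M A^2
max_diff = max(abs(lhs[i][j] - rhs[i][j])
               for i in range(size) for j in range(size))
```

Here `AB` and `BB` are zero matrices, which is exactly why $(A + B)^n$ collapses to $A^n + BA^{n-1}$.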

Since $A/a^2$ can be considered as a tr.pr.m on $S_{even}$, the Markov chain generated
by $A/a^2$ is an irreducible, aperiodic random walk on $S_{even}$ and since

$$\sum_{n=0}^{\infty} \prod_{i=0}^{n} (M)_{2i,2(i+1)}/(M)_{2(i+1),2i} = \sum_{n=0}^{\infty} \prod_{i=0}^{n} c_{2i}c_{2i+1}/(a_{2i+2}a_{2i+1}) = \sum_{n=0}^{\infty} \prod_{i=0}^{2n+1} c_i/a_{i+1} < \infty$$

because of (109), it follows that this random walk is positively recurrent and
therefore there exists a probability vector $q$ on $S$ such that

$$(q)_i = 0, \ \text{if } i \in S_{odd}$$

and

$$qA/a^2 = q.$$

Next, let $\rho > 0$ be given. In order to verify that Condition B is satisfied we
have to show that there exists an element $i_0 \in S$, such that if the compact set
$C$ is such that

$$\mu(C \cap E_{i_0}((\pi)_{i_0}/2)) > (\pi)_{i_0}/3, \quad \forall \mu \in \mathcal{P}(K|\pi), \tag{115}$$

then we can find an integer $N$ and an element $w^N \in \mathcal{W}^N$ such that

$$i_0 \in S_1(M(w^N)) \tag{116}$$

and

$$\|[xM(w^N)] - [e^{i_0}M(w^N)]\| < \rho, \quad \forall x \in C \cap E_{i_0}((\pi)_{i_0}/2).$$

We choose $i_0 = 0$, and we let $C_0 \subset K$ be a compact set such that (115) is
satisfied with $C$ replaced by $C_0$. Set

$$C' = \{xM(1)/\|xM(1)\| : x \in C_0\}.$$

Since $\|xM(1)\| \ge \min\{a, 1 - a\}$, $\forall x \in K$, it follows easily that $C'$ is also a
compact set. From the general theory on Markov chains (see e.g. [21], chapter
2), it therefore also follows that we can find an integer $N_1$ such that if $n > N_1$
then

$$\|y(A/a^2)^n - q\| < \rho/2, \ \text{if } y \in C'. \tag{117}$$

We now choose the integer $N = 2(N_1 + 1)$ and choose the sequence
$w_i$, $i = 1, 2, ..., N$ such that $w_i = 1$, if $i$ is odd and $w_i = 2$, if $i$ is even. Then

$$M(w^N) = M(1)M(2)M(1)M(2) \cdots M(1)M(2) = M^{N_1+1},$$

where as before $M = M(1)M(2)$. Clearly $i \in S_1(M^{N_1+1})$ if $i = 0$.

Next, let $x \in C_0$ be chosen arbitrarily, and set $z_1 = [xM]$ and $z_0 = [e^0M]$.
Using the fact that $\|yM\| = a^2$, $\forall y \in K$ such that $y = y^2$, where $y^2$ is defined
by (110), we now find, by using (113), (114), (117) and the scaling property
(25) (see Lemma 2.1) that

$$\|[xM^{N_1+1}] - [e^0M^{N_1+1}]\| = \|[z_1A^{N_1}] - [z_0A^{N_1}]\| = \|z_1A^{N_1}/a^{2N_1} - z_0A^{N_1}/a^{2N_1}\| \le \|z_1A^{N_1}/a^{2N_1} - q\| + \|z_0A^{N_1}/a^{2N_1} - q\| < \rho$$

and since $x \in C_0$ was chosen arbitrarily, we can conclude that Condition B is
satisfied. Since the tr.pr.m $P$ is irreducible, aperiodic, and positively recurrent
the conclusion of the theorem follows from Theorem 1.1.

It remains to consider the case when there exists $i_0 \in S$ such that $b_{i_0} >
\sup\{b_i : i \in S, i \ne i_0\}$. Let us first assume that $i_0 \in S_{odd}$. Define the matrix
$D = \{(D)_{i,j} : i \in S, j \in S\}$ by

$$(D)_{i,i} = (M(1))_{i,i},$$
$$(D)_{i,j} = 0, \ \text{if } i \ne j.$$

Clearly $\|D\| = b_{i_0}$, and

$$\lim_{n \to \infty} D^n/\|D^n\| = (e^{i_0})^c e^{i_0}$$

where $(e^{i_0})^c$ denotes the column vector obtained by taking the transpose of $e^{i_0}$.
Next let us note that

$$M(1)^n = M(1)D^{n-1} \tag{118}$$

for $n \ge 2$. Set

$$W_1 = (e^{i_0})^c e^{i_0}$$

and

$$W = M(1)W_1/\|M(1)W_1\|.$$

Using (118) it is now easy to verify that for all $i \in S$

$$\lim_{n \to \infty} \|e^i M(1)^n/\|M(1)^n\| - e^i W\| = 0$$

and thereby we have verified that Condition B1 is satisfied and therefore Condition
B is also satisfied, and since the tr.pr.m $P$ is irreducible, aperiodic and
positively recurrent, the conclusion again follows from Theorem 1.1.

If instead $b_{i_0} > \sup\{b_i : i \in S, i \ne i_0\}$ and $i_0 \in S_{even}$ we can argue in a
similar way using the matrix $M(2)$ instead of $M(1)$. □

11 Convex functions and barycenters. 

As usual, let $S$ be a denumerable set and let $K$ denote the set of probability
vectors on $S$. As was mentioned in the introduction (see subsection 1.3) the set
$K$ is a convex set. For the definitions of the set $C_{convex}[K]$ and the measure $\psi_x$
see subsection 1.8.

Following Choquet (see e.g [6]) we now make the following definition. 

Definition 11.1 (Compare e.g. [6], Definition 26.6.) Let $q \in K$ and let $\mu \in
\mathcal{P}(K|q)$ and also $\nu \in \mathcal{P}(K|q)$. We say that $\mu$ is more diffuse than $\nu$ and write
$\mu \succ \nu$ and $\nu \prec \mu$ if

$$(u, \mu) \ge (u, \nu), \quad \forall u \in C_{convex}[K].$$

Next, for each $q \in K$ we define the subset $\mathcal{P}_d(K|q)$ of $\mathcal{P}(K|q)$ as follows:

Definition 11.2 The subset $\mathcal{P}_d(K|q)$ of $\mathcal{P}(K|q)$ consists of all measures $\nu \in
\mathcal{P}(K|q)$ such that $\nu = \sum_{k=1}^{\infty} \alpha_k \delta_{x_k}$ where $\{x_k, k = 1, 2, ...\}$ is a sequence of
elements in $K$ and $\{\alpha_k, k = 1, 2, ...\}$ is a sequence of non-negative numbers
satisfying $\sum_{k=1}^{\infty} \alpha_k = 1$.

Proposition 11.1 Let $q \in K$ and let $\mu \in \mathcal{P}_d(K|q)$. Then $\mu$ is always more
diffuse than $\delta_q$ and less diffuse than $\psi_q$ which symbolically can be expressed as

$$\delta_q \prec \mu \prec \psi_q.$$

Proof. Let $q \in K$ and let $\mu \in \mathcal{P}_d(K|q)$ be such that

$$\mu = \sum_{k=1}^{\infty} \alpha_k \delta_{x_k}$$

where $\{x_k, k = 1, 2, ...\}$ is a set of elements in $K$ and $\{\alpha_k, k = 1, 2, ...\}$ is a
sequence of non-negative numbers satisfying $\sum_{k=1}^{\infty} \alpha_k = 1$. Since $\mu \in \mathcal{P}(K|q)$ it
follows that $b(\mu) = q$ and hence $\sum_{k=1}^{\infty} \alpha_k x_k = q$. Therefore, if $u \in C_{convex}[K]$,
it follows that

$$(u, \mu) = \sum_{k=1}^{\infty} \alpha_k u(x_k) \ge u\Big(\sum_{k=1}^{\infty} \alpha_k x_k\Big) = u(q) = (u, \delta_q)$$

which implies that $\mu \succ \delta_q$, and it also follows that

$$(u, \mu) = \sum_{k=1}^{\infty} \alpha_k u(x_k) \le \sum_{k=1}^{\infty} \alpha_k \Big(\sum_{i=1}^{\infty} (x_k)_i u(e^i)\Big) = \sum_{i=1}^{\infty} u(e^i) \sum_{k=1}^{\infty} \alpha_k (x_k)_i = \sum_{i=1}^{\infty} u(e^i)(q)_i = (u, \psi_q)$$

which implies that $\mu \prec \psi_q$. □
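The two inequalities in the proof above are Jensen-type bounds and can be checked numerically on a three-point state space. The sketch below is illustrative only; the convex function `u` (a maximum of two affine functions), the points and the weights are hypothetical choices.

```python
def u(x):
    """Illustrative convex function on K: the maximum of two affine functions."""
    affine = [((1.0, -1.0, 0.0), 0.0), ((0.0, 2.0, -1.0), 0.5)]
    return max(sum(xi * ai for xi, ai in zip(x, a)) + b for a, b in affine)


# A discrete measure mu = sum_k alpha_k delta_{x_k} on K (three states).
points = [(0.5, 0.5, 0.0), (0.0, 0.25, 0.75), (1.0, 0.0, 0.0)]
weights = [0.5, 0.25, 0.25]

# Barycenter q = sum_k alpha_k x_k.
q = tuple(sum(w * x[i] for w, x in zip(weights, points)) for i in range(3))
unit = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

lower = u(q)                                             # (u, delta_q)
middle = sum(w * u(x) for w, x in zip(weights, points))  # (u, mu)
upper = sum(q[i] * u(unit[i]) for i in range(3))         # (u, psi_q)
```

For these values the barycenter is $q = (0.5, 0.3125, 0.1875)$ and the three integrals are $0.9375 \le 1.0625 \le 1.28125$.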

Next recall that a continuous bounded convex function $u$ belongs to $C_{convex}[K]$
if the function $u$ can be obtained as

$$u = \sup\{v_n : n \in \mathcal{N}\}$$

where $\mathcal{N}$ is an arbitrary index set, and each $v_n$ is an affine function on $K$ such
that

$$v_n(x) = x a_n^c + b_n$$

where thus $a_n \in l^{\infty}(S)$ and each $b_n$ is a real number.

Proposition 11.2 Let $S$ be a denumerable set and let $\mathcal{M} \in \mathcal{G}(S)$. Then

$$u \in C_{convex}[K] \ \Rightarrow \ T_{\mathcal{M}}u \in C_{convex}[K].$$

Proof. Let $u \in C_{convex}[K]$. By definition this also implies that $u \in C[K]$.
Since the tr.pr.f $\mathbf{P}_{\mathcal{M}}$ induced by $\mathcal{M}$ is Lipschitz continuous, the space $(K, \mathcal{E})$ is
a complete, separable, metric space and the set $Lip[K]$ is measure determining,
it is easily proved that $T_{\mathcal{M}}u \in C[K]$ (the Feller property).

Next let $x, y \in K$ be two arbitrary points, let $0 < \lambda < 1$ and set $z =
\lambda x + (1 - \lambda)y$. To prove that $T_{\mathcal{M}}u \in C_{convex}[K]$ we have to prove that

$$T_{\mathcal{M}}u(z) \le \lambda T_{\mathcal{M}}u(x) + (1 - \lambda)T_{\mathcal{M}}u(y). \tag{119}$$

Recall that the set $\mathcal{W}_{\mathcal{M}}(\xi)$ for $\xi \in K$ is defined by $\mathcal{W}_{\mathcal{M}}(\xi) = \{w \in \mathcal{W} :
\|\xi M(w)\| > 0\}$.

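Next, let $l_1(S) = \{x = ((x)_i, i \in S) : \sum_{i \in S} |(x)_i| < \infty\}$. For $\xi \in l_1(S)$ and
$a \in l^{\infty}(S)$, we write $a(\xi) = \xi a^c$.

Now clearly, if $v : l_1(S) \to R$ is defined by $v(\xi) = a(\xi) + b$ where $a \in l^{\infty}(S)$
and $b \in R$, and $x_1 \in K$ and $x_2 \in K$, then if $\alpha_1 > 0$, $\alpha_2 > 0$ and $\alpha_1 + \alpha_2 = 1$, it
follows that

$$v(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 v(x_1) + \alpha_2 v(x_2) \tag{120}$$

and

$$a(x_1 + x_2) = a(x_1) + a(x_2). \tag{121}$$

Furthermore, if $u \in C_{convex}[K]$, then $u$ can be written

$$u = \sup\{v_n : n \in \mathcal{N}\}$$

where $\mathcal{N}$ is an arbitrary index set and each $v_n$ is an affine function on $K$ such
that

$$v_n(x) = x a_n^c + b_n$$

where thus $a_n \in l^{\infty}(S)$ and $b_n$ is a real number.
We now find by using (120) and (121) that

$$\begin{aligned}
T_{\mathcal{M}}u(z) &= \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \|zM(w)\| \sup_{n \in \mathcal{N}} v_n([zM(w)]) \\
&= \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \|zM(w)\| \sup_{n \in \mathcal{N}} \big((zM(w)/\|zM(w)\|)a_n^c + b_n\big) \\
&= \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \sup_{n \in \mathcal{N}} \big(zM(w)a_n^c + b_n\|zM(w)\|\big) \\
&= \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \sup_{n \in \mathcal{N}} \big(\lambda xM(w)a_n^c + b_n\lambda\|xM(w)\| + (1 - \lambda)yM(w)a_n^c + (1 - \lambda)b_n\|yM(w)\|\big) \\
&\le \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \sup_{n \in \mathcal{N}} \big(\lambda xM(w)a_n^c + b_n\lambda\|xM(w)\|\big) + \sum_{w \in \mathcal{W}_{\mathcal{M}}(z)} \sup_{n \in \mathcal{N}} \big((1 - \lambda)yM(w)a_n^c + b_n(1 - \lambda)\|yM(w)\|\big) \\
&= \sum_{w \in \mathcal{W}_{\mathcal{M}}(x)} \sup_{n \in \mathcal{N}} \big(\lambda xM(w)a_n^c + b_n\lambda\|xM(w)\|\big) + \sum_{w \in \mathcal{W}_{\mathcal{M}}(y)} \sup_{n \in \mathcal{N}} \big((1 - \lambda)yM(w)a_n^c + b_n(1 - \lambda)\|yM(w)\|\big) \\
&= \lambda T_{\mathcal{M}}u(x) + (1 - \lambda)T_{\mathcal{M}}u(y)
\end{aligned}$$

and hence (119) holds, which was what we wanted to prove. □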
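The convexity inequality (119) can be probed numerically for a small partition. The following sketch evaluates $T_{\mathcal{M}}u(x) = \sum_w \|xM(w)\|\,u([xM(w)])$ on a grid and records the largest violation of (119). The two-state partition and the convex function `u` are hypothetical choices made for illustration, and the helper names are not from the paper.

```python
def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


def u(x):
    """Illustrative convex function: maximum of the affine maps x0 - x1 and x1 - x0."""
    return max(x[0] - x[1], x[1] - x[0])


# Hypothetical partition of P = [[0.5, 0.5], [0.25, 0.75]].
partition = [[[0.5, 0.25], [0.25, 0.375]],
             [[0.0, 0.25], [0.0, 0.375]]]


def T(x):
    """T_M u(x): sum over w with ||x M(w)|| > 0 of ||x M(w)|| * u([x M(w)])."""
    total = 0.0
    for M in partition:
        y = vecmat(x, M)
        s = sum(y)                 # l1-norm, since all entries are nonnegative
        if s > 0:
            total += s * u([v / s for v in y])
    return total


def convexity_gap(lam, x, y):
    """T(z) - (lam T(x) + (1 - lam) T(y)) for z = lam x + (1 - lam) y."""
    z = [lam * a + (1 - lam) * b for a, b in zip(x, y)]
    return T(z) - (lam * T(x) + (1 - lam) * T(y))


grid = [k / 10 for k in range(11)]
worst = max(convexity_gap(lam, [s, 1 - s], [t, 1 - t])
            for lam in (0.25, 0.5, 0.75) for s in grid for t in grid)
```

A value of `worst` that is nonpositive up to rounding is consistent with (119) on this grid; it is of course not a proof.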

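Proposition 11.3 Let $S$ be a denumerable set, let $\mathcal{M}_1, \mathcal{M}_2 \in \mathcal{G}(S)$ and let
$P_1$, $P_2$ be the tr.pr.ms associated to $\mathcal{M}_1$ and $\mathcal{M}_2$ respectively. Let $q \in K$ and
suppose that

$$q = qP_1 = qP_2.$$

Then, if $u \in C_{convex}[K]$

$$(u, \delta_q) \le (u, \mathbf{P}_{\mathcal{M}_2}\delta_q) \le (u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) \le (u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\psi_q) \le (u, \mathbf{P}_{\mathcal{M}_2}\psi_q) \le (u, \psi_q).$$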

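Proof. From Theorem 5.2 follows that $\mathbf{P}_{\mathcal{M}_2}\delta_q \in \mathcal{P}(K|q)$. Since also $\mathbf{P}_{\mathcal{M}_2}\delta_q \in
\mathcal{P}_d(K|q)$ the first inequality now follows from Proposition 11.1.
To prove the second inequality, we first use (11) to obtain

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) = (T_{\mathcal{M}_2}u, \mathbf{P}_{\mathcal{M}_1}\delta_q)$$

and then, using the fact that $T_{\mathcal{M}_2}u \in C_{convex}[K]$ if $u \in C_{convex}[K]$ and the fact
that $\mathbf{P}_{\mathcal{M}_1}\delta_q \in \mathcal{P}_d(K|q)$, it follows from Proposition 11.1 that

$$(T_{\mathcal{M}_2}u, \mathbf{P}_{\mathcal{M}_1}\delta_q) \ge (T_{\mathcal{M}_2}u, \delta_q) = (u, \mathbf{P}_{\mathcal{M}_2}\delta_q).$$

Hence

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) \ge (u, \mathbf{P}_{\mathcal{M}_2}\delta_q)$$

and thereby the second inequality is proved.

To prove the third inequality we first use a) of Theorem 2.1 to obtain

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) = (u, \mathbf{P}_{\mathcal{M}_1\mathcal{M}_2}\delta_q).$$

Since also $\mathbf{P}_{\mathcal{M}_1\mathcal{M}_2}\delta_q \in \mathcal{P}_d(K|q)$, it follows from Proposition 11.2 that if $u \in
C_{convex}[K]$, then $T_{\mathcal{M}_1\mathcal{M}_2}u \in C_{convex}[K]$ and hence by Proposition 11.1 and
(11), it follows that

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) = (T_{\mathcal{M}_1\mathcal{M}_2}u, \delta_q) \le (T_{\mathcal{M}_1\mathcal{M}_2}u, \psi_q) = (u, \mathbf{P}_{\mathcal{M}_1\mathcal{M}_2}\psi_q).$$

From a) of Theorem 2.1 and the preceding inequality it now follows that

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\delta_q) \le (u, \mathbf{P}_{\mathcal{M}_1\mathcal{M}_2}\psi_q) = (u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\psi_q)$$

and thereby the third inequality is proved.

Next, using (11), Proposition 11.2 and Proposition 11.1 again, we find that

$$(u, \mathbf{P}_{\mathcal{M}_2}\mathbf{P}_{\mathcal{M}_1}\psi_q) = (T_{\mathcal{M}_2}u, \mathbf{P}_{\mathcal{M}_1}\psi_q) \le (T_{\mathcal{M}_2}u, \psi_q) = (u, \mathbf{P}_{\mathcal{M}_2}\psi_q)$$

which proves the fourth inequality, and the last inequality follows again from
Proposition 11.1 and the fact that $\mathbf{P}_{\mathcal{M}_2}\psi_q \in \mathcal{P}_d(K|q)$. □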
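As an immediate corollary we obtain

Proposition 11.4 Let $S$ be a denumerable space, let $P \in PM(S \times S)$, let $q \in K$
satisfy $q = qP$, let $\mathcal{M}$ be a partition of $P$ and let $u \in C_{convex}[K]$. Then, for
$n = 1, 2, ...$,

$$(u, \mathbf{P}^n_{\mathcal{M}}\delta_q) \le (u, \mathbf{P}^{n+1}_{\mathcal{M}}\delta_q) \le (u, \mathbf{P}^{n+1}_{\mathcal{M}}\psi_q) \le (u, \mathbf{P}^n_{\mathcal{M}}\psi_q).$$

Proof. Follows from Proposition 11.3 and Corollary 2.1. □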
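The monotone "sandwich" of Proposition 11.4 can be observed on a two-state example by enumerating all words of length $n$. The sketch below is only an illustration: the partition, the stationary vector $q = (1/3, 2/3)$ of $P = M(0) + M(1)$ and the convex function $u(x) = \max_i (x)_i$ are hypothetical choices, and `expect` is a hypothetical helper that computes $(u, \mathbf{P}^n_{\mathcal{M}}\delta_x)$ by brute force.

```python
from itertools import product


def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


def u(x):
    """Illustrative convex function on K."""
    return max(x)


# Hypothetical partition of P = [[0.5, 0.5], [0.25, 0.75]].
partition = [[[0.5, 0.25], [0.25, 0.375]],
             [[0.0, 0.25], [0.0, 0.375]]]
q = [1 / 3, 2 / 3]                      # stationary: q P = q
units = [[1.0, 0.0], [0.0, 1.0]]


def expect(x, n):
    """(u, P^n delta_x): sum over words of ||x M(w_1)...M(w_n)|| * u([...])."""
    total = 0.0
    for word in product(range(len(partition)), repeat=n):
        y = list(x)
        for w in word:
            y = vecmat(y, partition[w])
        s = sum(y)
        if s > 0:
            total += s * u([v / s for v in y])
    return total


steps = range(1, 6)
lower = [expect(q, n) for n in steps]                       # (u, P^n delta_q)
upper = [sum(qi * expect(e, n) for qi, e in zip(q, units))  # (u, P^n psi_q)
         for n in steps]
```

In this example `lower` is nondecreasing in $n$, `upper` is nonincreasing, and `lower[k] <= upper[k]` throughout.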



12 A martingale. 

As usual, let $S$ be a denumerable set and let $K$ denote the set of probability
vectors on $S$. Let $P \in PM(S \times S)$ and suppose that $\pi \in K$ is such that $\pi = \pi P$.
Let $\mathcal{M} = \{M(w) : w \in \mathcal{W}\}$ be a partition of $P$, and let $\{Z_n(\pi), n = 0, 1, 2, ...\}$
denote the sequence of stochastic variables with values in $(K, \mathcal{E})$ generated by
$\mathbf{P}_{\mathcal{M}}$ and the initial distribution $\delta_\pi$.

Next let $\mathcal{A}$ denote the discrete $\sigma$-algebra on $\mathcal{W}$. For $n = 1, 2, ...$ let as
before $\mathcal{W}^n = \prod_{i=1}^{n} \mathcal{W}_i$ where $\mathcal{W}_i = \mathcal{W}$, $i = 1, 2, ..., n$, and let $\mathcal{A}^n$ denote the
discrete $\sigma$-algebra on $\mathcal{W}^n$.

Now define $\alpha_1$ on $(\mathcal{W}, \mathcal{A})$ by

$$\alpha_1(B) = \sum_{w \in B} \|\pi M(w)\|, \quad B \in \mathcal{A},$$

and, for $n = 2, 3, ...$, define $\alpha_n$ on $(\mathcal{W}^n, \mathcal{A}^n)$ by

$$\alpha_n(B^n) = \sum_{w^n \in B^n} \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\|, \quad B^n \in \mathcal{A}^n.$$

Lemma 12.1 For $n = 1, 2, ...$ the set function $\alpha_n$ is a probability measure on
$(\mathcal{W}^n, \mathcal{A}^n)$. Furthermore $\alpha_{n+1}$ is an extension of $\alpha_n$.

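Proof. First note that

$$\sum_{(w_1, w_2, ..., w_{n+1}) \in \mathcal{W}^{n+1}} \|\pi M(w_{n+1})M(w_n)M(w_{n-1}) \cdots M(w_1)\| = \sum_{(w_1, ..., w_n) \in \mathcal{W}^n} \|\pi P M(w_n)M(w_{n-1}) \cdots M(w_1)\| = \sum_{(w_1, ..., w_n) \in \mathcal{W}^n} \|\pi M(w_n) \cdots M(w_1)\|,$$

since $\sum_{w \in \mathcal{W}} M(w) = P$ and $\pi P = \pi$. By induction follows that

$$\sum_{(w_1, ..., w_{n+1}) \in \mathcal{W}^{n+1}} \|\pi M(w_{n+1})M(w_n) \cdots M(w_1)\| = \sum_{w_1 \in \mathcal{W}} \|\pi M(w_1)\| = \|\pi P\| = \|\pi\| = 1$$

and since also $\|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| \ge 0$ it follows that each $\alpha_n$ is a
probability measure on $(\mathcal{W}^n, \mathcal{A}^n)$ respectively. That $\alpha_{n+1}$ is an extension of $\alpha_n$
for each $n$ follows from the fact that

$$\sum_{w_{n+1} \in \mathcal{W}} \|\pi M(w_{n+1})M(w_n)M(w_{n-1}) \cdots M(w_1)\| = \|\pi P M(w_n)M(w_{n-1}) \cdots M(w_1)\| = \|\pi M(w_n) \cdots M(w_1)\|. \ \Box$$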
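Both claims of Lemma 12.1 are easy to confirm by enumeration on a finite example. In the sketch below the two-state partition and the stationary vector $\pi = (1/3, 2/3)$ are hypothetical choices for illustration; note that the newest letter $w_{n+1}$ multiplies $\pi$ first, as in the definition of $\alpha_n$.

```python
from itertools import product


def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


# Hypothetical partition of P = [[0.5, 0.5], [0.25, 0.75]] with pi P = pi.
partition = [[[0.5, 0.25], [0.25, 0.375]],
             [[0.0, 0.25], [0.0, 0.375]]]
pi = [1 / 3, 2 / 3]


def alpha(word):
    """alpha_n(w_1,...,w_n) = ||pi M(w_n) M(w_{n-1}) ... M(w_1)||."""
    y = list(pi)
    for w in reversed(word):       # M(w_n) is applied first, M(w_1) last
        y = vecmat(y, partition[w])
    return sum(y)                  # l1-norm of a nonnegative vector


words3 = list(product(range(2), repeat=3))
total3 = sum(alpha(word) for word in words3)            # total mass of alpha_3

# Extension property: summing alpha_4 over the new letter recovers alpha_3.
extension_gap = max(abs(sum(alpha(word + (w,)) for w in range(2)) - alpha(word))
                    for word in words3)
```

Here `total3` equals one and `extension_gap` vanishes up to rounding; both facts rely on `pi` being stationary for $P$.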

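Next, let $\mathcal{W}^{\infty} = \prod_{i=1}^{\infty} \mathcal{W}_i$ where as above $\mathcal{W}_i = \mathcal{W}$, and let $\mathcal{A}^{\infty}$ be the least
$\sigma$-algebra containing all sets of the form $B_n \times \prod_{i=n+1}^{\infty} \mathcal{W}_i$ where $n = 1, 2, ...$
and $B_n \in \mathcal{A}^n$. Let $\alpha$ be the extension of the sequence $\{\alpha_n, n = 1, 2, ...\}$ of
probability measures on $(\mathcal{W}^n, \mathcal{A}^n)$, $n = 1, 2, ...$, to the space $(\mathcal{W}^{\infty}, \mathcal{A}^{\infty})$. We
denote an element in $\mathcal{W}^{\infty}$ by $\mathbf{w}$ and write

$$\mathbf{w} = (w_n, \ n = 1, 2, ...).$$

For $n = 1, 2, ...$ define $Y'_n : \mathcal{W}^{\infty} \to \mathcal{W}$ by $Y'_n(\mathbf{w}) = w_n$, and define $Z'_n :
\mathcal{W}^{\infty} \to K$ by

$$Z'_n(\mathbf{w}) = [\pi M(w_n)M(w_{n-1}) \cdots M(w_1)]$$

if $\|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| > 0$ and $Z'_n(\mathbf{w}) = \pi$ otherwise.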

That $Y'_n$, $n = 1, 2, ...$ are stochastic variables on the probability space
$(\alpha, \mathcal{W}^{\infty}, \mathcal{A}^{\infty})$ with values in $\mathcal{W}$ is obvious, and that also $Z'_n$, $n = 1, 2, ...$ are
stochastic variables on $(\alpha, \mathcal{W}^{\infty}, \mathcal{A}^{\infty})$ with values in $K$ is not too difficult to prove
if one uses arguments similar to those that are needed in order to prove that
the tr.pr.f $\mathbf{P}_{\mathcal{M}}(\cdot, \cdot)$, induced by a partition, is measurable in the first variable.
(See Proposition 14.1.)

From the definition of the measures $\alpha_n$, $n = 1, 2, ...$ and the definition of
$Y'_n$, $n = 1, 2, ...$ it is obvious that the following equality is true, which we state
without further proof.

Lemma 12.2 For $n = 1, 2, ...$

$$\Pr[Y'_1 = w_n, Y'_2 = w_{n-1}, ..., Y'_n = w_1] = \|\pi M(w_1)M(w_2) \cdots M(w_n)\|.$$

From Lemma 12.2 the following equality follows.

Lemma 12.3 For $n = 1, 2, ...$

$$\Pr[Z'_n \in B] = \Pr[Z_n(\pi) \in B], \quad \forall B \in \mathcal{E}.$$

Proof. Set

$$\mathcal{W}^{*n}(B) = \{(w_1, w_2, ..., w_n) \in \mathcal{W}^n : [\pi M(w_n)M(w_{n-1}) \cdots M(w_1)] \in B\}.$$

Then, we first note that

$$\Pr[Z'_n \in B] = \sum_{(w_1, w_2, ..., w_n) \in \mathcal{W}^{*n}(B)} \alpha_n(w_1, w_2, ..., w_n) = \sum_{(w_1, w_2, ..., w_n) \in \mathcal{W}^{*n}(B)} \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\|.$$

Next considering $Z_n(\pi)$ it is clear from the definition of $Z_n(\pi)$ that

$$\Pr[Z_n(\pi) \in B] = \sum_{(w_1, w_2, ..., w_n) \in \mathcal{W}^n_{\mathcal{M}}(\pi, B)} \|\pi M(w_1)M(w_2) \cdots M(w_n)\|$$

where $\mathcal{W}^n_{\mathcal{M}}(\pi, B)$ is defined by

$$\mathcal{W}^n_{\mathcal{M}}(\pi, B) = \{(w_1, w_2, ..., w_n) \in \mathcal{W}^n : \|\pi M(w_1)M(w_2) \cdots M(w_n)\| > 0 \ \text{and} \ [\pi M(w_1)M(w_2) \cdots M(w_n)] \in B\}, \quad B \in \mathcal{E}.$$

But if $(w_1, w_2, ..., w_n) \in \mathcal{W}^n_{\mathcal{M}}(\pi, B)$ then clearly $(w_n, w_{n-1}, ..., w_1) \in \mathcal{W}^{*n}(B)$
from which follows that

$$\Pr[Z_n(\pi) \in B] = \sum_{(w_n, w_{n-1}, ..., w_1) \in \mathcal{W}^{*n}(B)} \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| = \Pr[Z'_n \in B]. \ \Box$$

The following martingale relation also follows easily.

Proposition 12.1 For $n = 1, 2, 3, ...$ and $i \in S$

$$E[(Z'_{n+1}(\mathbf{w}))_i \,|\, \mathcal{A}^n] = (Z'_n(\mathbf{w}))_i.$$

Proof. First note that the set

$$\mathcal{W}^{*n,+} = \{(w_1, w_2, ..., w_n) : \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| > 0\}$$

is such that $\alpha_n(\mathcal{W}^{*n,+}) = 1$. Next note that if

$$(w_1, w_2, ..., w_n) \in \mathcal{W}^{*n,+}$$

then

$$\Pr[Y'_{n+1} = w \,|\, Y'_1 = w_1, Y'_2 = w_2, ..., Y'_n = w_n] = \|\pi M(w)M(w_n)M(w_{n-1}) \cdots M(w_1)\| / \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\|.$$

Hence if $(w_1, w_2, ..., w_n) \in \mathcal{W}^{*n,+}$

$$\begin{aligned}
E[(Z'_{n+1}(\mathbf{w}))_i \,|\, Y'_1 = w_1, Y'_2 = w_2, ..., Y'_n = w_n]
&= \sum_{w_{n+1} \in \mathcal{W}} \frac{(\pi M(w_{n+1})M(w_n) \cdots M(w_1))_i}{\|\pi M(w_{n+1})M(w_n) \cdots M(w_1)\|} \cdot \frac{\|\pi M(w_{n+1})M(w_n) \cdots M(w_1)\|}{\|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\|} \\
&= \sum_{w_{n+1} \in \mathcal{W}} (\pi M(w_{n+1})M(w_n)M(w_{n-1}) \cdots M(w_1))_i / \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| \\
&= (\pi \cdot P \cdot M(w_n)M(w_{n-1}) \cdots M(w_1))_i / \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| \\
&= (\pi M(w_n)M(w_{n-1}) \cdots M(w_1))_i / \|\pi M(w_n)M(w_{n-1}) \cdots M(w_1)\| = (Z'_n(\mathbf{w}))_i,
\end{aligned}$$

where the first sum runs over those $w_{n+1}$ for which $\|\pi M(w_{n+1})M(w_n) \cdots M(w_1)\| > 0$. □

Corollary 12.1 The sequence $\{Z'_n, n = 1, 2, ...\}$ converges almost surely.

Proof. Since, for each $i \in S$, $\{(Z'_n)_i, n = 1, 2, ...\}$ is a sequence of uniformly
bounded, stochastic variables, it follows from Proposition 12.1 and the martingale
convergence theorem (see e.g. [11]), that $\{(Z'_n)_i, n = 1, 2, ...\}$ converges
almost surely for each $i \in S$. Since $S$ is a denumerable set, it then follows that
also $\{Z'_n, n = 1, 2, ...\}$ converges almost surely. □
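The martingale relation of Proposition 12.1 can be verified word by word on a finite example. The sketch below uses a hypothetical two-state partition with stationary vector $\pi = (1/3, 2/3)$; the helper names are illustrative. For each word $(w_1, ..., w_n)$ it compares $Z'_n$ with the conditional expectation of $Z'_{n+1}$.

```python
from itertools import product


def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


# Hypothetical partition of P = [[0.5, 0.5], [0.25, 0.75]] with pi P = pi.
partition = [[[0.5, 0.25], [0.25, 0.375]],
             [[0.0, 0.25], [0.0, 0.375]]]
pi = [1 / 3, 2 / 3]


def unnormalized(word):
    """pi M(w_n) M(w_{n-1}) ... M(w_1) for word = (w_1,...,w_n)."""
    y = list(pi)
    for w in reversed(word):
        y = vecmat(y, partition[w])
    return y


def max_martingale_gap(n):
    """Largest coordinate gap between E[Z'_{n+1} | w_1..w_n] and Z'_n."""
    gap = 0.0
    for word in product(range(len(partition)), repeat=n):
        y = unnormalized(word)
        s = sum(y)
        if s == 0:
            continue                      # words of alpha_n-measure zero
        z_n = [v / s for v in y]
        cond = [0.0] * len(pi)
        for w in range(len(partition)):
            y_next = unnormalized(word + (w,))
            s_next = sum(y_next)
            if s_next > 0:
                prob = s_next / s         # Pr[Y'_{n+1} = w | w_1..w_n]
                cond = [c + prob * v / s_next for c, v in zip(cond, y_next)]
        gap = max(gap, max(abs(a - b) for a, b in zip(cond, z_n)))
    return gap


gap3 = max_martingale_gap(3)
```

The gap is zero up to rounding, because summing $\pi M(w_{n+1}) M(w_n) \cdots M(w_1)$ over $w_{n+1}$ gives $\pi P M(w_n) \cdots M(w_1)$ and $\pi P = \pi$.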



13 Blackwell's entropy formula. 

Let $h : [0, 1] \to [0, 1/(e \cdot \ln(2))]$ be defined by

$$h(t) = -t\ln(t)/\ln(2), \ \text{if } 0 < t \le 1 \ \text{and} \ h(0) = 0.$$

In this section we shall define the entropy and the entropy rate for a partition in
Q ' Z (S), (see section 2 for the definitions of Q ,Z (S)), and shall prove that the
formula obtained by Blackwell for the process $\{Y_n, -\infty < n < \infty\}$ described in
the introduction (see subsection 1.7) also holds under a more general situation.

Definition 13.1 Let $S$ be a denumerable set, let $\mathcal{M} = \{M(w) : w \in \mathcal{W}\} \in$
Q ' Z (S), let $P \in PM(S \times S)$ be the associated tr.pr.m and let $\pi \in K^+$ be such
that $\pi P = \pi$.

We define the entropy, $H(\mathcal{M})$, of the partition $\mathcal{M} \in$ Q ' Z (S), by

$$H(\mathcal{M}) = \sum_{w \in \mathcal{W}} h(\|\pi M(w)\|),$$

and we define the entropy rate $H_R(\mathcal{M})$ of the partition $\mathcal{M} \in$ Q ' Z (S) by

$$H_R(\mathcal{M}) = \lim_{n \to \infty} (H(\mathcal{M}^{n+1}) - H(\mathcal{M}^n)).$$
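For a finite partition the quantities in Definition 13.1 can be computed by enumerating words. The following sketch is a numerical illustration only; the two-state partition, the stationary vector $\pi = (1/3, 2/3)$ and the helper names are hypothetical. It evaluates $h$, the entropies $H(\mathcal{M}^n)$ and the successive differences $H(\mathcal{M}^{n+1}) - H(\mathcal{M}^n)$ whose limit defines the entropy rate.

```python
import math
from itertools import product


def h(t):
    """h(t) = -t ln(t)/ln(2) for 0 < t <= 1, and h(0) = 0."""
    return 0.0 if t == 0 else -t * math.log(t) / math.log(2)


def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


# Hypothetical partition of P = [[0.5, 0.5], [0.25, 0.75]] with pi P = pi.
partition = [[[0.5, 0.25], [0.25, 0.375]],
             [[0.0, 0.25], [0.0, 0.375]]]
pi = [1 / 3, 2 / 3]


def entropy(n):
    """H(M^n): sum of h(||pi M(w_1)...M(w_n)||) over all words of length n."""
    total = 0.0
    for word in product(range(len(partition)), repeat=n):
        y = list(pi)
        for w in word:
            y = vecmat(y, partition[w])
        total += h(sum(y))
    return total


# Successive differences H(M^{n+1}) - H(M^n) for n = 1,...,6.
rates = [entropy(n + 1) - entropy(n) for n in range(1, 7)]
```

In this example the differences in `rates` are nonnegative and nonincreasing in $n$, so the truncated values give upper estimates of the limit.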

That $H(\mathcal{M})$ is well-defined is clear since we assume that $\mathcal{M} \in$ Q ' Z (S). That
also $H_R(\mathcal{M})$ is well-defined follows from the following theorem. Recall that $\delta_q$
is defined by $\delta_q(\{q\}) = 1$ and that $\varphi_q$ is defined by $\varphi_q(\{e^i\}) = (q)_i$, $i \in S$. The
following theorem is similar to Theorem 4.4.1 of [7].
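Theorem 13.1 Let $S$ be a denumerable set, let $\pi \in K^{+}$, and let
$\mathcal{M} = \{M(w) : w \in W\} \in \mathcal{G}_\pi(S) \cap$ Q ' Z (S).

Then
a) for $n = 1, 2, \ldots$

$$
\begin{aligned}
\sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \varphi_\pi(dy)
&\le \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n+1}_{\mathcal{M}} \varphi_\pi(dy) \\
&\le \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n+1}_{\mathcal{M}} \delta_\pi(dy)
 \le \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy),
\end{aligned}
$$

b)

$$H_R(\mathcal{M}) = \lim_{n \to \infty} \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy). \qquad (122)$$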

Proof. For each $w \in W$ define $g_w : K \to R$ by

$$g_w(x) = h(\|xM(w)\|).$$

It is easily seen that if we define $f : [0, 1] \to R$ by

$$f(t) = \inf\{h(\kappa) + h'(\kappa)(t - \kappa) : 0 < \kappa < 1\}$$

then

$$f(t) = h(t).$$

Hence

$$g_w(x) = f(\|xM(w)\|) = \inf\{h(\kappa) + h'(\kappa)(\|xM(w)\| - \kappa) : 0 < \kappa < 1\}.$$
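The representation of $h$ as an infimum of its tangent lines is the usual description of a concave function. A minimal numerical sketch follows, assuming a finite grid of values of $\kappa$ as a stand-in for the open interval $0 < \kappa < 1$; the names `KAPPAS`, `tangent_envelope` and `GAPS` are hypothetical.

```python
import numpy as np

LN2 = np.log(2.0)


def h(t):
    """h(t) = -t ln(t)/ln(2) on (0, 1], extended by h(0) = 0 (works on arrays)."""
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0.0, t, 1.0)          # avoid log(0); the value is masked below
    return np.where(t > 0.0, -safe * np.log(safe) / LN2, 0.0)


def h_prime(kappa):
    """Derivative h'(kappa) = -(ln(kappa) + 1)/ln(2) for 0 < kappa < 1."""
    return -(np.log(kappa) + 1.0) / LN2


# Finite grid standing in for the open interval 0 < kappa < 1.
KAPPAS = np.linspace(1e-6, 1.0 - 1e-6, 200_001)


def tangent_envelope(t):
    """Minimum over the grid of the tangent lines h(kappa) + h'(kappa)(t - kappa)."""
    return float(np.min(h(KAPPAS) + h_prime(KAPPAS) * (t - KAPPAS)))


# Difference between the envelope and h at a few test points, including 0 and 1.
GAPS = [tangent_envelope(t) - float(h(t)) for t in (0.0, 0.1, 0.25, 0.5, 0.9, 1.0)]
```

Every tangent line lies above $h$, so each gap is nonnegative up to rounding, and refining the grid drives the gaps to zero, in line with $f(t) = h(t)$.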

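Since the mapping $\rho : K \to [0, 1]$ defined by $\rho(x) = \|xM(w)\|$ is a linear function
on $K$ it clearly follows that for $\kappa$ such that $0 < \kappa < 1$ the mapping from $K$ to
$R$ defined by

$$h(\kappa) + h'(\kappa)(\|xM(w)\| - \kappa)$$

is an affine mapping. Therefore the function $f_w(x)$ defined by $f_w(x) = -g_w(x)$
belongs to $C_{\mathrm{convex}}[K]$ for each $w \in W$. We can thus apply Corollary 11.1 to
$f_w$, and find that for each $w \in W$ and $n = 1, 2, \ldots$

$$
\int_K f_w(y)\, P^{n}_{\mathcal{M}} \varphi_\pi(dy)
\ge \int_K f_w(y)\, P^{n+1}_{\mathcal{M}} \varphi_\pi(dy)
\ge \int_K f_w(y)\, P^{n+1}_{\mathcal{M}} \delta_\pi(dy)
\ge \int_K f_w(y)\, P^{n}_{\mathcal{M}} \delta_\pi(dy),
$$

and by multiplying by $(-1)$ and adding over $w$ we obtain the conclusion of part
a) of the theorem.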

Next by using the inequalities of part a) we conclude that

$$\left\{ \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy),\ n = 1, 2, \ldots \right\}$$

is a decreasing sequence bounded from below and hence the right hand side
of (122) exists. From part b) of Theorem 2.1 follows that $H(\mathcal{M}^n)$ exists for
$n = 1, 2, \ldots$ and that

$$H(\mathcal{M}^{n+1}) - H(\mathcal{M}^{n}) = \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy).$$






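Hence

$$\lim_{n \to \infty} \left( H(\mathcal{M}^{n+1}) - H(\mathcal{M}^{n}) \right) = H_R(\mathcal{M}) = \lim_{n \to \infty} \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy),$$

and thereby part b) of the theorem is proved. □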
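The two-sided bounds of Theorem 13.1 can be observed numerically on a small example. The sketch below is an illustration under stated assumptions, not the paper's method: it uses a hypothetical two-state partition (`P`, `EMIT`, `MATS`, `PI` are invented for the purpose), evaluates $\sum_w \int_K h(\|yM(w)\|)\, P^n_{\mathcal{M}} \delta_x(dy)$ by brute-force enumeration of all words of length $n$, and takes as lower bound the same quantity started from the measure that puts mass $(\pi)_i$ on the unit vector $e^i$.

```python
import numpy as np


def h(t):
    """h(t) = -t ln(t)/ln(2) for t > 0 and h(0) = 0."""
    return 0.0 if t <= 0.0 else float(-t * np.log(t) / np.log(2.0))


# Hypothetical example: two hidden states, two observation symbols.
P = np.array([[0.6, 0.4], [0.3, 0.7]])
EMIT = np.array([[0.8, 0.2], [0.3, 0.7]])   # EMIT[j, w] = Pr(symbol w | state j)
MATS = [P * EMIT[:, w] for w in range(2)]   # M(w)_{ij} = P_{ij} * EMIT[j, w]
PI = np.array([3.0 / 7.0, 4.0 / 7.0])       # stationary vector of P


def word_vectors(x, mats, n):
    """All vectors x M(w_1) ... M(w_n), one per word (w_1, ..., w_n)."""
    vecs = [np.asarray(x, dtype=float)]
    for _ in range(n):
        vecs = [v @ M for v in vecs for M in mats]
    return vecs


def expected_h(x, mats, n):
    """sum_w int h(||y M(w)||) P^n delta_x(dy), enumerating all words of length n."""
    total = 0.0
    for v in word_vectors(x, mats, n):
        mass = v.sum()                       # probability of the word, started at x
        if mass > 0.0:
            y = v / mass                     # normalised vector reached along the word
            total += mass * sum(h((y @ M).sum()) for M in mats)
    return total


def upper(n):
    """Upper bound: initial measure delta_pi."""
    return expected_h(PI, MATS, n)


def lower(n):
    """Lower bound: initial measure with mass (pi)_i on the unit vector e^i."""
    return sum(PI[i] * expected_h(np.eye(2)[i], MATS, n) for i in range(2))


BOUNDS = [(lower(n), upper(n)) for n in range(1, 9)]
```

In this example the lower bounds increase, the upper bounds decrease, and the two sequences pinch the entropy rate between them. The upper bound for a given $n$ also coincides with the difference of the entropies of the partitions built from words of length $n+1$ and $n$.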

The entropy formula in the next theorem originates from the paper [1] by 
Blackwell from 1957. 



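Theorem 13.2 Let $S$ be a denumerable set, let $\pi \in K^{+}$, and let $\mathcal{M} = \{M(w) :
w \in W\} \in \mathcal{G}_\pi(S) \cap$ Q ' Z (S). Assume also that the tr.pr.f $P_{\mathcal{M}}$ is asymptotically
stable, and let $\mu$ denote the unique stationary measure of $P_{\mathcal{M}}$.
Then, for each $x \in K$

$$H_R(\mathcal{M}) = \lim_{n \to \infty} \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_x(dy) = \sum_{w \in W} \int_K h(\|yM(w)\|)\, \mu(dy).$$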

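Proof. Since 1) $\|yM(w)\|$ is a continuous function of $y$ for each $w \in W$, 2)
the function $h$ is a continuous function and 3) $P_{\mathcal{M}}$ is asymptotically stable, it
follows for each $x \in K$, and in particular for $x = \pi$, that

$$\lim_{n \to \infty} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}}(x, dy) = \lim_{n \to \infty} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_x(dy) = \int_K h(\|yM(w)\|)\, \mu(dy)$$

for each $w \in W$. Therefore, since all integrands are nonnegative, we can conclude
by using b) of Theorem 13.1 that for all $x \in K$

$$
\begin{aligned}
H_R(\mathcal{M}) &= \lim_{n \to \infty} \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_\pi(dy) \\
&= \lim_{n \to \infty} \sum_{w \in W} \int_K h(\|yM(w)\|)\, P^{n}_{\mathcal{M}} \delta_x(dy)
 = \sum_{w \in W} \int_K h(\|yM(w)\|)\, \mu(dy). \qquad \square
\end{aligned}
$$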
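The independence of the limit from the starting point $x$ can be illustrated on a small example in which the induced chain forgets its initial vector quickly. The following sketch is only an experiment under assumptions: the two-state partition (`P`, `EMIT`, `MATS`, `PI`) and the set of starting vectors `STARTS` are hypothetical, and the integrals are computed exactly by enumerating all words of length $n$.

```python
import numpy as np


def h(t):
    """h(t) = -t ln(t)/ln(2) for t > 0 and h(0) = 0."""
    return 0.0 if t <= 0.0 else float(-t * np.log(t) / np.log(2.0))


# Hypothetical example: two hidden states, two observation symbols.
P = np.array([[0.6, 0.4], [0.3, 0.7]])
EMIT = np.array([[0.8, 0.2], [0.3, 0.7]])   # EMIT[j, w] = Pr(symbol w | state j)
MATS = [P * EMIT[:, w] for w in range(2)]   # M(w)_{ij} = P_{ij} * EMIT[j, w]
PI = np.array([3.0 / 7.0, 4.0 / 7.0])       # stationary vector of P


def expected_h(x, mats, n):
    """sum_w int h(||y M(w)||) P^n delta_x(dy), enumerating all words of length n."""
    vecs = [np.asarray(x, dtype=float)]
    for _ in range(n):
        vecs = [v @ M for v in vecs for M in mats]
    total = 0.0
    for v in vecs:
        mass = v.sum()
        if mass > 0.0:
            y = v / mass
            total += mass * sum(h((y @ M).sum()) for M in mats)
    return total


# Several starting vectors x in K (illustrative choices).
STARTS = {"pi": PI, "e1": [1.0, 0.0], "e2": [0.0, 1.0], "uniform": [0.5, 0.5]}

VALUES = {n: {name: expected_h(x, MATS, n) for name, x in STARTS.items()}
          for n in (2, 6, 12)}

# Largest difference between starting points, for each n.
SPREAD = {n: max(vals.values()) - min(vals.values()) for n, vals in VALUES.items()}
```

For this strongly mixing example the spread across starting vectors is already small at $n = 12$, so every starting point yields essentially the same approximation of $H_R(\mathcal{M})$.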



14 $P_{\mathcal{M}}$ is a transition probability function.

Theorem 14.1 Let $S$ be a denumerable set, and let $\mathcal{M} = \{M(w) : w \in W\}$
be a partition of $P \in PM(S \times S)$. Define the function $P_{\mathcal{M}} : K \times \mathcal{E} \to [0, 1]$ by

$$P_{\mathcal{M}}(x, B) = \sum_{w \in W_{\mathcal{M}}(x, B)} \|xM(w)\|, \qquad x \in K,\ B \in \mathcal{E},$$

where

$$W_{\mathcal{M}}(x, B) = \{w \in W : \|xM(w)\| > 0,\ xM(w)/\|xM(w)\| \in B\}.$$

Then, for every $x \in K$, $P_{\mathcal{M}}(x, \cdot)$ is a probability measure on $(K, \mathcal{E})$, and also,
for every $B \in \mathcal{E}$,

$$P_{\mathcal{M}}(\cdot, B)$$

is a measurable function. Hence $P_{\mathcal{M}} : K \times \mathcal{E} \to [0, 1]$ is a transition probability
function according to the usual definition in probability theory. (See e.g. [25],
Definition 1.8.)
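For a finite state space the function $P_{\mathcal{M}}(x, \cdot)$ is a finite discrete measure, and the definition can be transcribed almost literally. The sketch below assumes a finite $S$, represents a set $B$ by a membership predicate, and uses a hypothetical two-state partition; the names `transition_atoms`, `P_M`, `step`, `MATS` and `X` are illustrative.

```python
import numpy as np


def transition_atoms(x, mats):
    """Atoms of P_M(x, .): pairs (||x M(w)||, x M(w)/||x M(w)||) with positive mass."""
    atoms = []
    for M in mats:
        v = x @ M
        mass = float(v.sum())            # l1-norm, since all entries are nonnegative
        if mass > 0.0:
            atoms.append((mass, v / mass))
    return atoms


def P_M(x, mats, in_B):
    """P_M(x, B) for a set B given by the membership predicate in_B."""
    return sum(mass for mass, y in transition_atoms(x, mats) if in_B(y))


def step(x, mats, rng):
    """Sample one transition of the induced chain from the vector x."""
    atoms = transition_atoms(x, mats)
    probs = np.array([mass for mass, _ in atoms])
    k = rng.choice(len(atoms), p=probs / probs.sum())
    return atoms[k][1]


# Hypothetical partition of a 2 x 2 transition probability matrix.
P = np.array([[0.6, 0.4], [0.3, 0.7]])
EMIT = np.array([[0.8, 0.2], [0.3, 0.7]])
MATS = [P * EMIT[:, w] for w in range(2)]   # the matrices M(w); they sum to P
X = np.array([0.25, 0.75])


def in_K(y):
    return True


def in_B1(y):
    return y[0] > 0.5


def in_B2(y):
    return y[0] <= 0.5
```

Here `P_M(X, MATS, in_K)` equals 1, and the values on the disjoint sets described by `in_B1` and `in_B2` add up to it, which is the additivity used in the proof below.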

Proof. To prove that $P_{\mathcal{M}}(x, \cdot)$ is a probability measure for each $x \in K$
let us first note that $P_{\mathcal{M}}(x, K) = \sum_{w \in W} \|xM(w)\| = \|xP\| = 1$. Furthermore
if $B_1, B_2 \in \mathcal{E}$ are such that $B_1 \cap B_2 = \emptyset$ then clearly also
$W_{\mathcal{M}}(x, B_1) \cap W_{\mathcal{M}}(x, B_2) = \emptyset$, and more generally if $\{B_n,\ n = 1, 2, \ldots\}$ is a
sequence of disjoint sets then clearly $\{W_{\mathcal{M}}(x, B_n),\ n = 1, 2, \ldots\}$ is also a sequence
of disjoint sets, from which follows that if $\{B_n,\ n = 1, 2, \ldots\}$ is a sequence of
disjoint sets then

$$P_{\mathcal{M}}\left(x, \bigcup_{n=1}^{\infty} B_n\right) = \sum_{n=1}^{\infty} P_{\mathcal{M}}(x, B_n).$$

Hence $P_{\mathcal{M}}(x, \cdot)$ is a probability measure for each $x \in K$.

It thus remains to prove that if $B \in \mathcal{E}$ then $P_{\mathcal{M}}(\cdot, B)$ is $\mathcal{E}$-measurable.
We first prove the following proposition.

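Proposition 14.1 Let $S$ be a denumerable set and let $M$ denote a square,
non-negative $S \times S$ matrix, such that $0 < \sup\{\|xM\| : x \in K\} \le 1$. Let
$K^1 = \{x \in K : \|xM\| > 0\}$. For an arbitrary $F \subset K$ we define
$K^1_F = \{x \in K^1 : xM/\|xM\| \in F\}$, and $u_F : K \to [0, 1]$ by

$$u_F(x) = \|xM\| \ \text{ if } x \in K^1_F$$

and

$$u_F(x) = 0 \ \text{ if } x \notin K^1_F.$$

Further let

$$\mathcal{F} = \{F \subset K : u_F \text{ is } \mathcal{E}\text{-measurable}\}.$$

Then:

a) If $F$ is an open set then $F \in \mathcal{F}$.

b) If $F = F_1 \cap F_2$ where $F_1 \in \mathcal{F}$ and $F_2 \in \mathcal{F}$ then $F \in \mathcal{F}$.

c) If $F = \bigcup_{i=1,2,\ldots} F_i$ and $F_i \in \mathcal{F}$, $i = 1, 2, \ldots$ then $F \in \mathcal{F}$.

d) If $F = K \setminus F_1$ and $F_1 \in \mathcal{F}$ then $F \in \mathcal{F}$.

e) If $B \in \mathcal{E}$ then $u_B$ is $\mathcal{E}$-measurable.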

Proof of Proposition 14.1. For an arbitrary $F \subset K$ define $I_{K^1_F} : K \to \{0, 1\}$
by

$$I_{K^1_F}(x) = 1 \ \text{ if } x \in K^1 \text{ and } xM/\|xM\| \in F, \qquad I_{K^1_F}(x) = 0 \ \text{ otherwise.}$$

From the definition of $u_F$ it is clear that we can express $u_F$ by

$$u_F(x) = \|xM\|\, I_{K^1_F}(x).$$

Next suppose $F_0 = F_1 \cap F_2$. Then $I_{K^1_{F_0}}(x) = 1$ if and only if $x \in K^1$ and
$xM/\|xM\| \in F_1 \cap F_2$, from which it follows that

$$I_{K^1_{F_0}}(x) = \min\{I_{K^1_{F_1}}(x), I_{K^1_{F_2}}(x)\}.$$

Hence if $F_i \in \mathcal{F}$, $i = 1, 2$ and $F_0 = F_1 \cap F_2$ then we can rewrite $u_{F_0}(x)$ as follows:

$$
\begin{aligned}
u_{F_0}(x) &= \|xM\|\, I_{K^1_{F_0}}(x) = \|xM\| \min\{I_{K^1_{F_1}}(x), I_{K^1_{F_2}}(x)\} \\
&= \min\{\|xM\|\, I_{K^1_{F_1}}(x), \|xM\|\, I_{K^1_{F_2}}(x)\} = \min\{u_{F_1}(x), u_{F_2}(x)\}.
\end{aligned}
$$

Since $F_i \in \mathcal{F}$, $i = 1, 2$ it follows from the definition of $\mathcal{F}$ that $u_{F_1}$ and $u_{F_2}$
are $\mathcal{E}$-measurable, and since the minimum of two measurable functions is
also measurable it follows that $u_{F_0}$ is also $\mathcal{E}$-measurable and hence $F_0 \in \mathcal{F}$.
Thereby assertion b) of the proposition is proved.

To prove assertion c) we assume that $F_0 = \bigcup_{i=1,2,\ldots} F_i$ and that $F_i \in \mathcal{F}$,
$i = 1, 2, \ldots$. Then we can write

$$
\begin{aligned}
u_{F_0}(x) &= \|xM\|\, I_{K^1_{F_0}}(x) = \|xM\|\, I_{K^1_{\bigcup_{i=1,2,\ldots} F_i}}(x)
 = \|xM\| \sup_{i=1,2,\ldots} I_{K^1_{F_i}}(x) \\
&= \sup\{\|xM\|\, I_{K^1_{F_i}}(x) : i = 1, 2, \ldots\} = \sup\{u_{F_i}(x) : i = 1, 2, \ldots\},
\end{aligned}
$$

which implies that $u_{F_0}$ is $\mathcal{E}$-measurable since if $\{f_1, f_2, \ldots\}$ is a denumerable
set of measurable functions then $\sup\{f_1, f_2, \ldots\}$ is also measurable.
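In order to prove assertion d), with $F_0 = K \setminus F_1$, we can write

$$u_{F_0}(x) = \|xM\|\, I_{K^1_{F_0}}(x) = \|xM\|\, I_{K^1_{K \setminus F_1}}(x) = \|xM\| \left( I_{K^1}(x) - I_{K^1_{F_1}}(x) \right) = \|xM\| - u_{F_1}(x),$$

and since both $\|xM\|$ and $u_{F_1}(x)$ are $\mathcal{E}$-measurable as functions of $x$ it follows
that $u_{F_0}(x)$ is $\mathcal{E}$-measurable, and hence $F_0 \in \mathcal{F}$.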
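The three identities used for assertions b), c) and d) can be checked pointwise on a small example. This is a minimal sketch under assumptions: a hypothetical $3 \times 3$ nonnegative matrix `M` with one zero row (so that $\|xM\| = 0$ for one unit vector), two sets $F_1$, $F_2$ given by the predicates `in_F1`, `in_F2`, and a finite union standing in for the countable one.

```python
import numpy as np

# Hypothetical nonnegative matrix with sup ||xM|| <= 1; the zero third row gives
# ||e_3 M|| = 0, so the case of a vector outside K^1 is exercised as well.
M = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.0, 0.3],
              [0.0, 0.0, 0.0]])


def u(x, in_F):
    """u_F(x) = ||xM|| if ||xM|| > 0 and xM/||xM|| lies in F, and 0 otherwise."""
    v = x @ M
    mass = float(v.sum())
    if mass > 0.0 and in_F(v / mass):
        return mass
    return 0.0


def in_F1(y):
    return y[0] > 0.4


def in_F2(y):
    return y[1] < 0.3


def in_cap(y):
    return in_F1(y) and in_F2(y)


def in_cup(y):
    return in_F1(y) or in_F2(y)


def in_comp(y):
    return not in_F1(y)


rng = np.random.default_rng(0)
POINTS = list(np.eye(3)) + list(rng.dirichlet(np.ones(3), size=200))


def identities_hold(x):
    """Check the min, sup and complement identities at the point x."""
    norm = float((x @ M).sum())
    return (u(x, in_cap) == min(u(x, in_F1), u(x, in_F2))
            and u(x, in_cup) == max(u(x, in_F1), u(x, in_F2))
            and u(x, in_comp) == norm - u(x, in_F1))


ALL_OK = all(identities_hold(x) for x in POINTS)
```

Each of the functions involved takes only the values $0$ and $\|xM\|$ at a given point, which is why the identities hold exactly and not merely up to rounding.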

From assertions b), c) and d) it follows that $\mathcal{F}$ is a $\sigma$-algebra and since $\mathcal{E}$
is the least $\sigma$-algebra containing the open sets, we can conclude that part e)
follows when we also have proved assertion a) of the proposition.

The proof of assertion a) is fairly standard but somewhat tedious. Thus
let us assume that $F \subset K$ is an open set. For $-\infty < b < \infty$ let

$$K_b = \{x \in K : u_F(x) < b\}.$$

Furthermore define $a_0$ by

$$a_0 = \sup\{\|xM\| : x \in K\}.$$

To prove that $u_F$ is measurable it suffices to prove that $K_b \in \mathcal{E}$ for all $b \in R$.
Since $0 \le u_F(x) \le a_0$, it is clear that $K_b = \emptyset$ if $b \le 0$ and $K_b = K$ if $b > a_0$,
and since both $\emptyset \in \mathcal{E}$ and $K \in \mathcal{E}$, it remains to show that $K_b \in \mathcal{E}$ for $0 < b \le a_0$.

We first consider the set $K'_0 = \{x \in K : u_F(x) = 0\}$. Now from the definition
of $u_F$ we find that the set $K'_0$ can be written

$$K'_0 = (K \setminus K^1) \cup (K^1 \setminus K^1_F).$$

Since the function $x \to \|xM\|$ is a continuous function it is clear that the set
$K^1 = \{x \in K : \|xM\| > 0\}$ belongs to $\mathcal{E}$. Next consider the set
$K^1_F = \{x \in K^1 : xM/\|xM\| \in F\}$. Now if $x_0 \in K^1_F$, so that $\|x_0 M\| > 0$, then, since
the set $F$ is assumed to be open, there exists a number $\rho > 0$ such that if
$\|y - x_0\| < \rho$ then $y \in K^1_F$, from which follows that $K^1_F$ is an open set. (That
such a number $\rho > 0$ exists follows from the inequality (38) since it implies that

$$\left\| xM/\|xM\| - yM/\|yM\| \right\| \le 2 \|xM - yM\| / \|xM\|$$

if both $\|xM\| > 0$ and $\|yM\| > 0$.) Therefore $K$, $K^1$ and $K^1_F$ all belong to $\mathcal{E}$
and hence $K'_0 \in \mathcal{E}$.
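The continuity estimate for the normalisation map can be verified numerically. The sketch below checks the displayed bound in the $l_1$-norm on random inputs; it follows from the triangle inequality applied to $a/\|a\| - b/\|a\|$ and $b/\|a\| - b/\|b\|$ with $a = xM$, $b = yM$, and the check does not depend on how inequality (38) itself is stated. The matrix `M`, the sample size and the name `WORST` are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nonnegative 4 x 4 matrix; every row sum is below 1.
M = rng.random((4, 4)) / 4.0


def normalise(v):
    """Divide a nonnegative, nonzero vector by its l1-norm."""
    return v / v.sum()


# Largest observed ratio of the left hand side to the right hand side.
WORST = 0.0
for _ in range(2000):
    x, y = rng.dirichlet(np.ones(4), size=2)     # two random probability vectors
    a, b = x @ M, y @ M
    lhs = np.abs(normalise(a) - normalise(b)).sum()
    rhs = 2.0 * np.abs(a - b).sum() / a.sum()
    if rhs > 0.0:
        WORST = max(WORST, lhs / rhs)
```

A ratio `WORST` of at most 1 means the bound held on every sampled pair. The factor $1/\|xM\|$ on the right is why the estimate is only useful on the set where $\|xM\| > 0$.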

Next let $a, b \in R$ be two numbers satisfying $0 < a < b \le a_0$ but otherwise
arbitrary. Define

$$K_{a,b} = \{x \in K : a < u_F(x) < b\}.$$

Then again utilizing the assumption that the set $F$ is an open set, we find that
if $x \in K_{a,b}$ then $a < u_F(y) < b$ if $y \in B(x, \rho)$ and $\rho > 0$ is sufficiently small;
hence $K_{a,b}$ is an open set and therefore belongs to $\mathcal{E}$.
Since, when $0 < b \le a_0$, $K_b$ can be written

$$K_b = K'_0 \cup \bigcup_{n=1}^{\infty} K_{1/n,\, b}$$

and we have shown that each of these sets belongs to $\mathcal{E}$, it follows that $K_b \in \mathcal{E}$
when $0 < b \le a_0$ and thereby we have proved that $K_b \in \mathcal{E}$ for all $b \in R$. Hence
$u_F$ is $\mathcal{E}$-measurable, and the proof of Proposition 14.1 is completed. □

We now return to the proof of Theorem 14.1. Let $B \in \mathcal{E}$ be fixed but
arbitrary. For $w \in W$, define $u_{w,B} : K \to [0, 1]$ by

$$u_{w,B}(x) = \|xM(w)\| \ \text{ if } x \in K^1_{w,B}$$

and

$$u_{w,B}(x) = 0 \ \text{ if } x \notin K^1_{w,B},$$

where

$$K^1_{w,B} = \{x \in K^1_w : xM(w)/\|xM(w)\| \in B\}$$

and where

$$K^1_w = \{x \in K : \|xM(w)\| > 0\}.$$

With these functions at our disposal we can write

$$P_{\mathcal{M}}(x, B) = \sum_{w \in W_{\mathcal{M}}(x, B)} \|xM(w)\| = \sum_{w \in W} u_{w,B}(x).$$

But from e) of Proposition 2.5 follows that $u_{w,B}$ is $\mathcal{E}$-measurable for each
$w \in W$. Since $W$ is denumerable and $u_{w,B} \ge 0$ it follows that $P_{\mathcal{M}}(\cdot, B)$ is
$\mathcal{E}$-measurable, which was what we wanted to prove. □